searching one list for values from another list.

Welcome Forums General PowerShell Q&A searching one list for values from another list.

This topic contains 4 replies, has 3 voices, and was last updated by

 
Participant
3 weeks, 2 days ago.

  • Author
    Posts
  • #179175

    Participant
    Topics: 26
    Replies: 57
    Points: 123
    Rank: Participant

    first, always good to find your own forum posts when googling for how to do stuff

    https://powershell.org/forums/topic/pipe-multiple-foreach-resutls-into-a-two-column-array/

    that was the second part of a bigger question i was answering today. basically i'm pulling an array of $sources with a 'path' column whose values look like 'share/folder/subfolder/' (yes, with those slash directions) from one location, and and array of $allstats from a mysql query, which has a "sourcefilename" column with a similar path as $sources. i just want a count of how many times each $source.path appears in $allstats. i just have to do a little manipulation with the slash directions.

    i came up with three ways to do this query. version 1 was even before the previous paragraph, i was looping through $sources and re-querying the mysql database repeatedly for each $source.path. pretty slow. then i figured it'd be faster to pull the whole mysql query into powershell once, and then loop through that PS array looking for each $source. so that array is $allstats.

    version 2, i foreach through each $source, and nest another foreach going through each line in $allstats to see if it matches, then increment a counter if they do.

    version 3, i do ($allstats | where-object -blah blah).count.

    (hoping this gist link works)

     

    v3 looks like a cleaner way to do the same thing as v2, but but v3 takes about 30% longer to run, according to measure-command.  which made me wonder why, and made me wonder if there's a way to do it that's 30% faster than v2.

     

  • #179418

    Senior Moderator
    Topics: 8
    Replies: 1085
    Points: 3,651
    Helping Hand
    Rank: Community Hero

    how about

    # use below instead of the internal foreach in V1
    $allstat.sourcefilename.Where({$_ -like "*$sourcepath*"}).count
    
  • #179427

    Participant
    Topics: 26
    Replies: 57
    Points: 123
    Rank: Participant

    @kvprasoon,i may be confused. the code in the OP was v2 and v3, there was no V1... and i'm not sure where to drop your code into my loops. i tried it one way, and it ended up running longer than either of the previous two versions, and i stopped it. the resulting $array only had two lines in it at that point.

  • #179436

    Participant
    Topics: 0
    Replies: 113
    Points: 420
    Helping Hand
    Rank: Contributor

    Hey John,

    I believe Kvprasoon was meaning for V2 not V1. And he is referring to this below with my interpretation.

    Measure-Command {
        foreach ($source in $sources)
        {
            $array += [pscustomobject]@{
            count = $allstat.sourcefilename.Where({$_ -like "*$sourcepath*"}).count
            path = $source.path}
        }
    }

     

  • #180246

    Participant
    Topics: 26
    Replies: 57
    Points: 123
    Rank: Participant

    so to clear something up... $allstatS is the array, $allstat is one row in a foreach.

    so since your suggestions are not foreaching through allstats, i think you guys are meaning:

    $allstatS.sourcefilename.Where({$_ -like "*$sourcepath*"}).count

    v2 takes about 50 seconds, v3 takes about 75 seconds. if i let either of the two below run for three minutes, $array is only about 25% complete.

    one thing i found that sped up v2 even more was reducing the size of $allstats after every source with

    $allstats = $allstats -notlike "$sourcepath"

    so the next loop didn't have to parse through all the $allstats rows that i'd already counted. that knocked it down to 30 seconds.

     

    so at the core of this is how we're parsing $allstats. so i just took out those commands and measured them for one $sourcepath.

    the v2 foreach($allstat in $allstats){blah blah...count++} loop takes 3 seconds

    the v3 ($allstats | ? sourcefilename -like "*$sourcepath*").count takes 6 seconds

    ($allstats.sourcefilename.Where({$_ -like "*$sourcepath*"}).count takes 46 seconds.

     

You must be logged in to reply to this topic.