Fastest Way to Return Names in array1 that are not in array2

This topic contains 5 replies, has 4 voices, and was last updated by Max Kozlov 4 months ago.

  • #77442

    DJ
    Participant

    What is the fastest way to return elements in array1 not found in array2?

    Array1 has about 5,000 entries.
    Array2 has about 18,000 entries.
    Each array holds only one value per entry (the name), so I don't think I can use hashtable key/value pairs.
    The arrays are not sorted.

    Please provide a simple example of the option you recommend.

  • #77445

    Don Jones
    Keymaster

    Compare-Object. See help for examples.

  • #77451

    Sam Boutros
    Participant

    what he said 🙂

    $Array1 = 1..5000 | % { [PSCustomObject]@{
            bla1 = $_
            bla2 = "Text$_"
        }
    }
    
    $Array2 = 3000..20000 | % { [PSCustomObject]@{
            bla1 = $_
            bla2 = "Text$_"
        }
    }
    
    # Elements in Array1 not in Array2 ==> 1 - 2999
    
    $Duration = Measure-Command {
        $myOutput = Compare-Object -ReferenceObject $Array1 -DifferenceObject $Array2 -PassThru -Property bla1 | 
            Where-Object { $_.SideIndicator -eq '<=' }
    }
    'Test 1: Compare-Object and Where-Object'
    "    Identified $($myOutput.count) records in $($Duration.TotalSeconds) seconds"
    
    $Duration = Measure-Command {
        $myOutput = (Compare-Object -ReferenceObject $Array1 -DifferenceObject $Array2 -PassThru -Property bla1).Where{ $_.SideIndicator -eq '<=' } 
    }
    'Test 2: Compare-Object and Where() method'
    "    Identified $($myOutput.count) records in $($Duration.TotalSeconds) seconds"
    
    $Duration = Measure-Command {
        $myOutput = $Array1 | % {
            if ($_.bla1 -notin $Array2.bla1 ) { $_ }
        } 
    }
    'Test 3: Enumerate Array1 elements, test each against Array2'
    "    Identified $($myOutput.count) records in $($Duration.TotalSeconds) seconds"
    
    Test 1: Compare-Object and Where-Object
        Identified 2999 records in 11.8748938 seconds
    Test 2: Compare-Object and Where() method
        Identified 2999 records in 10.6478435 seconds
    Test 3: Enumerate Array1 elements, test each against Array2
        Identified 2999 records in 48.2678471 seconds
    
  • #77458

    DJ
    Participant

    Thanks for the quick responses. I've tried some of these options, but they are still very slow. I will measure them and report my results with code examples. Also, I can't use a hashtable because my 2 arrays have only 1 property each, so they don't meet the key/value pair requirement for a hashtable.

  • #77496

    DJ
    Participant

    Tried Compare-Object on the 2 results and it works well.

    However, I got an error about null, and I couldn't see what was wrong. The arrays are not null. If I ran the contents of ReferenceObject on its own it worked fine, and the same for DifferenceObject. After much troubleshooting, I found that my arrays contained 2 null lines, and apparently Compare-Object doesn't work with ANY null entries.

    Code I used to replace the null entries:
    $ruleApplications = $ruleApplications -replace "^$", "null"
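
An alternative to substituting a placeholder is to drop the null and empty entries before comparing. The sketch below assumes the entries are strings and that discarding blanks is acceptable; the sample values are hypothetical, and only the `$ruleApplications` name comes from the post above.

```powershell
# Hypothetical sample data containing a null and an empty entry
$ruleApplications = 'RuleA', $null, 'RuleB', '', 'RuleC'

# Keep only the entries that are neither null nor empty;
# @() guarantees an array even if a single entry survives
$cleaned = @($ruleApplications | Where-Object { -not [string]::IsNullOrEmpty($_) })

"Kept $($cleaned.Count) of $($ruleApplications.Count) entries"
```

Unlike the `-replace` approach, this changes the element count, so it only fits when the blank lines carry no meaning.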

    Thanks for ALL of the help, much appreciated.

  • #77568

    Max Kozlov
    Participant

    If you don't have duplicates in the arrays, why can't you use hashtables?
    What stops you from using $hash[$key] = 1 and $hash.ContainsKey($key)?
    And even if there are duplicates, you can store $hash[$key] = $count.
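
A minimal sketch of the hashtable idea described above, assuming each entry is a plain string name. The sample values and the `$seen` and `$onlyInArray1` variable names are made up for illustration.

```powershell
# Hypothetical sample data standing in for the two unsorted arrays
$Array1 = 'alpha', 'bravo', 'charlie', 'delta'
$Array2 = 'bravo', 'delta', 'echo'

# Use each name in Array2 as a key; the value (1) is just a placeholder
$seen = @{}
foreach ($name in $Array2) { $seen[$name] = 1 }

# Emit the Array1 names that are not keys in the hashtable
$onlyInArray1 = foreach ($name in $Array1) {
    if (-not $seen.ContainsKey($name)) { $name }
}

$onlyInArray1
```

The name itself serves as the key, so no second property is needed. Note that a hashtable created with `@{}` compares string keys case-insensitively, and that a null key raises an error, so null entries would need to be handled first.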

    There is also HashSet:
    https://msdn.microsoft.com/en-us/library/bb359438(v=vs.110).aspx
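
The HashSet suggestion could look roughly like the following. This is a sketch under assumptions, not a measured result: it assumes PowerShell 5 or later (for `::new`), string entries, and sample data shaped like the ranges in the earlier benchmark. The `Name$_` values and variable names are hypothetical.

```powershell
# Hypothetical sample data: same ranges as the earlier benchmark, as plain strings
$Array1 = 1..5000     | ForEach-Object { "Name$_" }
$Array2 = 3000..20000 | ForEach-Object { "Name$_" }

# Build the set from Array2 once; each later lookup is hash-based
$lookup = [System.Collections.Generic.HashSet[string]]::new([string[]]$Array2)

# Keep only the Array1 entries that the set does not contain
$onlyInArray1 = foreach ($name in $Array1) {
    if (-not $lookup.Contains($name)) { $name }
}

"Identified $($onlyInArray1.Count) records"
```

By default a `HashSet[string]` compares case-sensitively, unlike PowerShell's comparison operators. If case should be ignored, passing `[System.StringComparer]::OrdinalIgnoreCase` as a second constructor argument is one option.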
