Fastest Way to Return Names in array1 that are not in array2


This topic contains 5 replies, has 4 voices, and was last updated by a participant 1 year, 3 months ago.

  • #77442
    DJ

    Participant
    Points: 0
    Rank: Member

    What is the fastest way to return elements in array1 not found in array2?

    Array1 has about 5,000 entries.
    Array2 has about 18,000 entries.
    Each array holds only one value per entry (its name), so I can't use hash key/value pairs.
    The arrays are not sorted.

    Please provide a simple example of the option you recommend.

  • #77445

    Keymaster
    Points: 1,704
    Helping Hand, Team Member
    Rank: Community Hero

    Compare-Object. See help for examples.

  • #77451

    Participant
    Points: 85
    Rank: Member

    what he said 🙂

    $Array1 = 1..5000 | % { [PSCustomObject]@{
            bla1 = $_
            bla2 = "Text$_"
        }
    }
    
    $Array2 = 3000..20000 | % { [PSCustomObject]@{
            bla1 = $_
            bla2 = "Text$_"
        }
    }
    
    # Elements in Array1 not in Array2 ==> 1 - 2999
    
    $Duration = Measure-Command {
        $myOutput = Compare-Object -ReferenceObject $Array1 -DifferenceObject $Array2 -PassThru -Property bla1 | 
            Where-Object { $_.SideIndicator -eq '<=' }
    }
    'Test 1: Compare-Object and Where-Object'
    "    Identified $($myOutput.count) records in $($Duration.TotalSeconds) seconds"
    
    $Duration = Measure-Command {
        $myOutput = (Compare-Object -ReferenceObject $Array1 -DifferenceObject $Array2 -PassThru -Property bla1).Where{ $_.SideIndicator -eq '<=' } 
    }
    'Test 2: Compare-Object and Where() method'
    "    Identified $($myOutput.count) records in $($Duration.TotalSeconds) seconds"
    
    $Duration = Measure-Command {
        $myOutput = $Array1 | % {
            if ($_.bla1 -notin $Array2.bla1 ) { $_ }
        } 
    }
    'Test 3: Enumerate Array1 elements, test each against Array2'
    "    Identified $($myOutput.count) records in $($Duration.TotalSeconds) seconds"
    
    Test 1: Compare-Object and Where-Object
        Identified 2999 records in 11.8748938 seconds
    Test 2: Compare-Object and Where() method
        Identified 2999 records in 10.6478435 seconds
    Test 3: Enumerate Array1 elements, test each against Array2
        Identified 2999 records in 48.2678471 seconds
    
  • #77458
    DJ

    Participant
    Points: 0
    Rank: Member

    Thanks for the quick responses. I've tried some of these options, but they are still very slow. I will measure them and report my results and code examples. Also, I can't use a hashtable because my 2 arrays have only 1 property each, so they don't meet the key/value pair requirement for a hashtable.

  • #77496
    DJ

    Participant
    Points: 0
    Rank: Member

    Tried running Compare-Object on the 2 results and it works well.

    However, I got an error about null, and at first I couldn't see what was wrong. The arrays themselves are not null. If I output the contents passed to ReferenceObject it works fine, and the same goes for DifferenceObject. After much troubleshooting, I found that my arrays contained 2 null lines, and apparently Compare-Object doesn't work with ANY null entries.

    Code to replace the null entries used:
    $ruleApplications = $ruleApplications -replace "^$", "null"
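
An alternative to substituting a placeholder is to drop the null and empty entries before comparing. The following is only a sketch: the sample values are made up, and `$cleaned` is a hypothetical variable name, not something from the original script.

```powershell
# Hypothetical data: a name list that contains a null and an empty string.
$ruleApplications = 'app1', $null, 'app2', '', 'app3'

# Keep only the entries that are neither $null nor an empty string.
# @(...) forces an array even if zero or one entry survives the filter.
$cleaned = @($ruleApplications | Where-Object { -not [string]::IsNullOrEmpty($_) })

$cleaned.Count
```

Whether to filter or to replace depends on whether the blank entries mean anything. Filtering removes them from the comparison, while the `-replace` approach keeps them as a literal "null" string.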

    Thanks for ALL of the help, much appreciated.

  • #77568

    Participant
    Points: 0
    Rank: Member

    If you do not have duplicates in the arrays, why can't you use hashtables?
    What stops you from using $hash[$key] = 1 and $hash.ContainsKey($key)?
    And even if there are duplicates, you can store $hash[$key] = $count.
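
A minimal sketch of the hashtable idea described above, assuming the arrays hold plain strings. The sample names and the `$seen` variable are placeholders for illustration only.

```powershell
# Hypothetical sample data; 'bravo' appears twice in Array2 on purpose.
$Array1 = 'alpha', 'bravo', 'charlie', 'delta'
$Array2 = 'bravo', 'delta', 'echo', 'bravo'

# Use each name as the key; the value counts how often it occurs in Array2.
# A @{} hashtable compares string keys case-insensitively by default.
$seen = @{}
foreach ($name in $Array2) {
    # [int]$null is 0, so a name seen for the first time gets a count of 1.
    $seen[$name] = 1 + [int]$seen[$name]
}

# Emit the names from Array1 that never occurred in Array2.
$notInArray2 = foreach ($name in $Array1) {
    if (-not $seen.ContainsKey($name)) { $name }
}

$notInArray2
```

The value stored under each key is arbitrary. Only the key is needed for the lookup, which is why a list of single names still fits in a hashtable.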

    There is also HashSet:
    https://msdn.microsoft.com/en-us/library/bb359438(v=vs.110).aspx
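
The HashSet suggestion could look roughly like the sketch below. It assumes PowerShell 5 or later (for the `::new()` syntax) and plain string names. The sample data mirrors the ranges used earlier in the thread, and `$lookup` is a hypothetical variable name. No timing is claimed here; wrap it in Measure-Command to compare against the tests above.

```powershell
# Hypothetical sample data shaped like the earlier benchmark: names only.
$Array1 = 1..5000     | ForEach-Object { "Text$_" }
$Array2 = 3000..20000 | ForEach-Object { "Text$_" }

# Build a set from Array2 so each membership check is a hash lookup
# instead of a scan over all of Array2.
# OrdinalIgnoreCase mirrors PowerShell's default case-insensitive comparison.
$lookup = [System.Collections.Generic.HashSet[string]]::new(
    [string[]]$Array2,
    [System.StringComparer]::OrdinalIgnoreCase
)

# Keep only the names from Array1 that the set does not contain.
$notInArray2 = foreach ($name in $Array1) {
    if (-not $lookup.Contains($name)) { $name }
}

$notInArray2.Count
```

Neither array needs to be sorted, and duplicates in Array2 simply collapse into one set entry.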

The topic ‘Fastest Way to Return Names in array1 that are not in array2’ is closed to new replies.