Help cleaning an array of ‘duplicates’
October 14, 2020 at 3:18 pm #263373
I have a script that generates an array that has output as shown below:
[test group2] contains [test group1]
[test group3] contains [test group2]
[test group1] contains [test group2]
[test group2] contains [test group3]
The lines such as [test group2] contains [test group1] and [test group1] contains [test group2] are essentially duplicates, just in a different order. I really only need one or the other; it's not important which one I keep.
How could I go about clearing out any duplicates from the array?
-
October 14, 2020 at 3:45 pm #263385
Hmmm… if [test group1] = [test group2] then
[test group1] contains [test group2] = [test group2] contains [test group1]
but if [test group1] != [test group2] then
[test group1] contains [test group2] != [test group2] contains [test group1]
I’m not really following the logic here…
-
October 14, 2020 at 4:09 pm #263394
So the same would be true for [test group2] contains [test group3] and [test group3] contains [test group2], correct? If so, you could use a switch statement with -Regex together with a list, and only add a line to the list if no line already in the list contains both group names. It’s a lot easier to show than to express in words, so please see the example.
To simulate a Get-Content $somefile we’ll use a here-string split at new lines. Then we’ll prepare an empty list and finally process each line through the switch.
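```powershell
# Split on LF or CRLF so the here-string splits correctly however the script was saved
$output = @'
[test group2] contains [test group1]
[test group3] contains [test group2]
[test group1] contains [test group2]
[test group2] contains [test group3]
'@ -split '\r?\n'

$result = [System.Collections.Generic.List[string]]::new()

switch -Regex ($output){
    # $Matches.1 and $Matches.2 hold the two bracketed group names
    '\[(.+)\].+\[(.+)\]' {
        # Add the line only if no line already in $result contains both names
        if( -not ($result | Where-Object {$_.contains($Matches.1) -and $_.contains($Matches.2)})){
            $result.Add($_)
        }
    }
}
```

And what ends up in $result are the first two lines, the ones that weren’t duplicates of a line already in the list.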
```powershell
$result

[test group2] contains [test group1]
[test group3] contains [test group2]
```

The switch statement can also read directly from files. See the following expanded example.
```powershell
# Write the sample lines to a temporary file
$tempfile = New-TemporaryFile

@'
[test group2] contains [test group1]
[test group3] contains [test group2]
[test group1] contains [test group2]
[test group2] contains [test group3]
[test group2] contains [test group8]
[test group3] contains [test group7]
[test group1] contains [test group8]
[test group4] contains [test group7]
[test group7] contains [test group4]
[test group7] contains [test group3]
[test group7] contains [test group1]
'@ | Set-Content $tempfile -Encoding UTF8

$result = [System.Collections.Generic.List[string]]::new()

# -File makes switch read the file line by line
switch -Regex -File ($tempfile){
    '\[(.+)\].+\[(.+)\]' {
        # Add the line only if no line already in $result contains both names
        if( -not ($result | Where-Object {$_.contains($Matches.1) -and $_.contains($Matches.2)})){
            $result.Add($_)
        }
    }
}

$result

[test group2] contains [test group1]
[test group3] contains [test group2]
[test group2] contains [test group8]
[test group3] contains [test group7]
[test group1] contains [test group8]
[test group4] contains [test group7]
[test group7] contains [test group1]
```
-
October 14, 2020 at 4:30 pm #263400
This works perfectly!
Thank you!
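For longer inputs, one possible alternative to the switch-and-list approach above (a sketch, not the method used in the thread) is to build an order-independent key for each line by sorting the two group names, and let a HashSet decide whether that pair has already been seen. This avoids rescanning $result for every line, and it compares whole names, so a name like test group1 would not match inside test group10. The function name Remove-MirroredPair and the '|' key separator are hypothetical choices made for illustration, and the sketch assumes each line holds exactly two bracketed names.

```powershell
# Hypothetical helper (not from the thread): keeps the first line seen
# for each unordered pair of bracketed names.
function Remove-MirroredPair {
    param(
        [string[]]$Line
    )

    # Case-insensitive set of pair keys that have already been emitted.
    $seen = [System.Collections.Generic.HashSet[string]]::new([System.StringComparer]::OrdinalIgnoreCase)

    foreach ($item in $Line) {
        # Non-greedy groups capture only the text inside each pair of brackets.
        if ($item -match '\[(.+?)\].+\[(.+?)\]') {
            # Sorting the two names gives "A contains B" and "B contains A" the same key.
            # Assumption: '|' never appears inside a group name.
            $key = ($Matches[1], $Matches[2] | Sort-Object) -join '|'

            # HashSet.Add returns $true only the first time a key is added.
            if ($seen.Add($key)) {
                $item
            }
        }
    }
}

$lines = @(
    '[test group2] contains [test group1]'
    '[test group3] contains [test group2]'
    '[test group1] contains [test group2]'
    '[test group2] contains [test group3]'
)

$unique = @(Remove-MirroredPair -Line $lines)
$unique
```

With the four sample lines from the original question, $unique should hold the first two lines, the same result as the switch example. Lines that do not match the pattern are dropped silently in this sketch, which may or may not be what you want for real data.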