Combining data – how to also add data that is in set2 and not set1?

Welcome Forums General PowerShell Q&A Combining data – how to also add data that is in set2 and not set1?

Viewing 5 reply threads
  • Author
    Posts
    • #193120
      Participant
      Topics: 1
      Replies: 1
      Points: 14
      Rank: Member

      Hello!

      I am brand new to PowerShell (started yesterday). I’m trying to combine two sets of data into a single array using a common variable (name in example below). The issue I am running into is that values may appear on one set and not the other, but I still need them to show up.

      I’ve been able to get it so that values from $CSVData01 that aren’t on $CSVData02 appear, but I am struggling with also getting values on $CSVData02 that aren’t on $CSVData01 to appear. Here’s what I have with dummy data:

      $CSVData01 = @’
      “name”,”start_date”,”end_date”
      Ashleigh,1/5/2019,4/5/2019
      Hunter,5/8/2018,8/8/2018
      Steve,6/25/2019,9/25/2019
      Jason,11/9/2015,2/9/2016
      Heather,3/22/2018,6/22/2018
      Sarah,7/16/2019,10/16/2019
      Jane,1/1/2017,4/1/2017
      ‘@ | Convertfrom-Csv
      
      $CSVData02 = @’
      “name”,”rehire”,”bonus”
      Ashleigh,5/8/2019,500.00
      Hunter,8/25/2019,250.00
      Steve,10/22/2019,100.00
      Heather,7/13/2019,600.00
      Sarah,10/5/2019,50.00
      Alice,5/5/2019/6000.00
      Andy,1/8/2015,2.00
      ‘@ | Convertfrom-Csv
      
      $outarray = @(
      foreach ($DataSet01 in $CSVData01) {
      $DataSet02 = $CSVData02 | Where-Object {$DataSet01.name -eq $_.name}
      [PSCustomObject]@{
      name = $DataSet01.name
      start_date = $DataSet01.start_date
      end_date = $DataSet01.end_date
      rehire = $DataSet02.rehire
      bonus = $DataSet02.bonus
      }
      }
      )
      $outarray | Write-Output

      Here is the output I’m getting

      name : Ashleigh
      start_date : 1/5/2019
      end_date : 4/5/2019
      rehire : 5/8/2019
      bonus : 500.00

      name : Hunter
      start_date : 5/8/2018
      end_date : 8/8/2018
      rehire : 8/25/2019
      bonus : 250.00

      name : Steve
      start_date : 6/25/2019
      end_date : 9/25/2019
      rehire : 10/22/2019
      bonus : 100.00

      name : Jason
      start_date : 11/9/2015
      end_date : 2/9/2016
      rehire :
      bonus :

      name : Heather
      start_date : 3/22/2018
      end_date : 6/22/2018
      rehire : 7/13/2019
      bonus : 600.00

      name : Sarah
      start_date : 7/16/2019
      end_date : 10/16/2019
      rehire : 10/5/2019
      bonus : 50.00

      name : Jane
      start_date : 1/1/2017
      end_date : 4/1/2017
      rehire :
      bonus :

      How do I get Alice and Andy also to appear on this output?

      Thanks so much!

    • #193138
      Participant
      Topics: 5
      Replies: 2411
      Points: 6,193
      Helping Hand
      Rank: Community MVP

      You have a typo in your $CSVData02. 😉

      $CSVData01 = @'
      "name","start_date","end_date"
      Ashleigh,1/5/2019,4/5/2019
      Hunter,5/8/2018,8/8/2018
      Steve,6/25/2019,9/25/2019
      Jason,11/9/2015,2/9/2016
      Heather,3/22/2018,6/22/2018
      Sarah,7/16/2019,10/16/2019
      Jane,1/1/2017,4/1/2017
      '@ | Convertfrom-Csv
      
      $CSVData02 = @'
      "name","rehire","bonus"
      Ashleigh,5/8/2019,500.00
      Hunter,8/25/2019,250.00
      Steve,10/22/2019,100.00
      Heather,7/13/2019,600.00
      Sarah,10/5/2019,50.00
      Alice,5/5/2019,6000.00
      Andy,1/8/2015,2.00
      '@ | Convertfrom-Csv
      
      $IntermediateResult = Compare-Object -ReferenceObject $CSVData01 -DifferenceObject $CSVData02 -Property name -IncludeEqual -PassThru |
      Select-Object -Property Name, start_date, end_date
      
      $outarray = @(
          foreach ($DataSet01 in $IntermediateResult) {
              $DataSet02 = $CSVData02 | Where-Object { $DataSet01.name -eq $_.name }
              [PSCustomObject]@{
                  name       = $DataSet01.name
                  start_date = $DataSet01.start_date
                  end_date   = $DataSet01.end_date
                  rehire     = $DataSet02.rehire
                  bonus      = $DataSet02.bonus
              }
          }
      )
      $outarray | 
          Format-Table -AutoSize
    • #193141
      Participant
      Topics: 13
      Replies: 1683
      Points: 2,849
      Helping Hand
      Rank: Community Hero

      Nice job thus far for a beginner! There isn’t any default cmdlets for joining objects, so it is a manual process unless you use a function like Join-Object. Basically, you are doing a FULL JOIN from a SQL standpoint. You’ve gotten the matches, so you need to identify the records that didn’t match and create a new object. The $results.Name is an implicit loop, so it’s creating an array of the names and the -notcontains searches that array for the names not contained in the $CSVData02 object. I’m only mentioning what it’s doing because it a shortcut to generate an array. Here is a an example (there was a small issue with CSV 2 for Alice having an extra / vs a , which I fixed in the code too):

      $CSVData01 = @'
      "name","start_date","end_date"
      Ashleigh,1/5/2019,4/5/2019
      Hunter,5/8/2018,8/8/2018
      Steve,6/25/2019,9/25/2019
      Jason,11/9/2015,2/9/2016
      Heather,3/22/2018,6/22/2018
      Sarah,7/16/2019,10/16/2019
      Jane,1/1/2017,4/1/2017
      '@ | Convertfrom-Csv
      
      $CSVData02 = @'
      "name","rehire","bonus"
      Ashleigh,5/8/2019,500.00
      Hunter,8/25/2019,250.00
      Steve,10/22/2019,100.00
      Heather,7/13/2019,600.00
      Sarah,10/5/2019,50.00
      Alice,5/5/2019,6000.00
      Andy,1/8/2015,2.00
      '@ | Convertfrom-Csv
      
      
      $results = foreach ($DataSet01 in $CSVData01) {
          $DataSet02 = $CSVData02 | Where-Object {$DataSet01.name -eq $_.name}
      
          [PSCustomObject]@{
              name       = $DataSet01.name
              start_date = $DataSet01.start_date
              end_date   = $DataSet01.end_date
              rehire     = $DataSet02.rehire
              bonus      = $DataSet02.bonus
          }
      }
      
      #Get the other object and Select-Object generates a new PSObject that matches the schema returned in $results
      $noMatch = $CSVData02 | 
                 Where{$results.Name -notcontains $_.Name} | 
                 Select name, start_date, end_date, rehire, bonus
      
      #Create an empty array and then add both schemas
      $joined = @()
      $joined += $results
      $joined += $noMatch
      
      'Matched: {0}' -f $results.Count
      'Not Matched: {0}' -f $noMatch.Count
      'Joined: {0}' -f $joined.Count
      
      $joined | Format-Table
      
    • #193150
      Participant
      Topics: 1
      Replies: 1
      Points: 14
      Rank: Member

      Thank y’all so much! You really are community heroes!

    • #194777
      Participant
      Topics: 0
      Replies: 1
      Points: -10
      Rank: Member

      I hope it’s alright to expand on this thread a little. I’m trying to accomplish the same thing, only at a much larger scale.

      I have CSVs that I’m importing with close to 50k rows (Or in this case, unique users).

      Importing the CSVs only takes about a second, but after that it gets stuck.

    • #194792
      Participant
      Topics: 5
      Replies: 2411
      Points: 6,193
      Helping Hand
      Rank: Community MVP

      Normaly you would create your own new thread for your question and reference this thread with a link if needed.

      It’s gonna be hard to give a reasonable advice if you do not show your code.

Viewing 5 reply threads
  • The topic ‘Combining data – how to also add data that is in set2 and not set1?’ is closed to new replies.