Merging data.

This topic contains 3 replies, has 3 voices, and was last updated by Profile photo of Hassan Serajian Hassan Serajian 2 months ago.

  • Author
    Posts
  • #62269
    Profile photo of Gorstag
    Gorstag
    Participant

    Hello,

    I am trying to figure out how to merge data to build a specific array that can be exported out to CSV. I have already reached the point where I can build the following array which provides all of the raw data that can be exported out to CSV.

    $results = for ($i = 0; $i -lt $emailaddress.count; $i++) {
        New-Object psobject -Property @{
            EmailUser = $user[$i]
            EmailDomain = $domain[$i]
            EmailAddress = $emailaddress[$i]
            Disposition = $disposition[$i]
            Date = $date[$i]
            Time = $time[$i]
        }
    }
    
    

    What I am trying to do is make a second CSV that only lists each unique email address and the count for each disposition. So to start on that I created 5 different "arrays" of data that all have one common element "emailaddress" This was done by searching for each disposition and grouping them by email address giving a unique count for each disposition.

    Here are the arrays in question:

    $Total = for ($i = 0; $i -lt $Total.count; $i++) {
        New-Object psobject -Property @{
            'EmailAddress' = $Total[$i].Name
            'Total' = $Total[$i].Count
    
        }
    }
    
    
    $Tnone = for ($i = 0; $i -lt $Tnone.count; $i++) {
        New-Object psobject -Property @{
            'EmailAddress' = $Tnone[$i].Name
            'None' = $Tnone[$i].Count
    
        }
    }
    
    $Tspam = for ($i = 0; $i -lt $Tspam.count; $i++) {
        New-Object psobject -Property @{
            'EmailAddress' = $Tspam[$i].Name
            'Spam' = $Tspam[$i].Count
    
        }
    }
    
    $Topl = for ($i = 0; $i -lt $Topl.count; $i++) {
        New-Object psobject -Property @{
            'EmailAddress' = $Topl[$i].Name
            'OPL' = $Topl[$i].Count
    
        }
    }
    
    $Tgray = for ($i = 0; $i -lt $Tgray.count; $i++) {
        New-Object psobject -Property @{
            'EmailAddress' = $Tgray[$i].Name 
            'Gray' = $Gray[$i].Count
    
        }
    }
    

    What I am trying to figure out is how to join all the data together to end up with a CSV that looks like:

    Emailaddress, Total, None, Spam, OPL, Gray
    Bob@example.com, 50, 30, 9, 10, 1
    Tom@example.com, 72, 19, 21, 30, NULL
    Sue@example.com, 3, 2, NULL, 1, NULL

    Instead I am getting something like:

    Emailaddress, Total, None, Spam, OPL, Gray
    Bob@example.com, 50, NULL, NULL, NULL, NULL
    Tom@example.com, 72, NULL, NULL, NULL, NULL
    Sue@example.com, 3, NULL, NULL, NULL, NULL
    Bob@example.com, NULL, 30, NULL, NULL, NULL
    Tom@example.com, NULL, 19, NULL, NULL, NULL
    Sue@example.com, NULL, 2, NULL, NULL, NULL

    When I try to join them using:

    $finaltotal = @()
    $finaltotal += $total | select EmailAddress,Total,None,Spam,OPL,Gray
    $finaltotal += $Tnone | select EmailAddress,Total,None,Spam,OPL,Gray
    $finaltotal += $tspam | select EmailAddress,Total,None,Spam,OPL,Gray
    $finaltotal += $topl | select EmailAddress,Total,None,Spam,OPL,Gray
    $finaltotal += $Tgray | select EmailAddress,Total,None,Spam,OPL,Gray
    

    Any idea how to do this properly?

  • #62272
    Profile photo of Gorstag
    Gorstag
    Participant

    Okay, I missed a response from someone else on a different forum. Here is the solution using the first data set.

    $data = @"
    EmailUser,EmailDomain,EmailAddress,Disposition
    john,domain.com,john@domain.com,
    john,domain.com,john@domain.com,
    john,domain.com,john@domain.com,
    john,domain.com,john@domain.com,
    matt,domain.com,matt@domain.com,
    matt,domain.com,matt@domain.com,
    matt,domain.com,matt@domain.com,
    matt,domain.com,matt@domain.com,
    matt,domain.com,matt@domain.com,
    matt,domain.com,matt@domain.com,
    "@ | ConvertFrom-Csv
    
    $uniqueEmails = $data.EmailAddress | Sort -Unique
    $dispositionResults = $data | Select EmailAddress,Disposition | Group-Object EmailAddress,Disposition | Select @{N='EmailAddress';E={$_.Name.Split(',')[0].trim()}},@{N='Disposition';E={$_.Name.Split(',')[-1].trim().trim('')}},Count
    
    $results = foreach ($email in $uniqueEmails) {
                    $tmpObj = [pscustomobject]@{EmailAddress='';Total=0;None=0;Spam=0;OPL=0;Gray=0}
                    $tmpObj.EmailAddress = $email
                    $total = 0
    
                    $dispositionResults | ? { $_.EmailAddress -eq $email } | % {
    
                        $total += $_.Count
                        $tmpObj.($_.Disposition) = $_.Count
    
                    }
    
                    $tmpObj.Total = $total
    
                    Write-Output $tmpObj
                }
    

    I am not sure how this works so I will figure it out and learn. But it does work and did test it against my data set multiple times.

    • #62341
      Profile photo of Iron
      Iron
      Participant

      Just a quick suggestion. Since you're effectively enumerating your $results and performing a filter action on them, I would recommend moving your write-output command within the foreach loop. That way, if you turn this into a function, every iteration of foreach can output data into the pipeline for further data manipulation (like | format-table or select-object).

      Example

      $data = @"
      EmailUser,EmailDomain,EmailAddress,Disposition
      john,domain.com,john@domain.com,
      john,domain.com,john@domain.com,
      john,domain.com,john@domain.com,
      john,domain.com,john@domain.com,
      matt,domain.com,matt@domain.com,
      matt,domain.com,matt@domain.com,
      matt,domain.com,matt@domain.com,
      matt,domain.com,matt@domain.com,
      matt,domain.com,matt@domain.com,
      matt,domain.com,matt@domain.com,
      "@ | ConvertFrom-Csv
      
      $uniqueEmails = $data.EmailAddress | Sort -Unique
      $dispositionResults = $data | Select EmailAddress,Disposition | Group-Object EmailAddress,Disposition | Select @{N='EmailAddress';E={$_.Name.Split(',')[0].trim()}},@{N='Disposition';E={$_.Name.Split(',')[-1].trim().trim('')}},Count
      
      $results = foreach ($email in $uniqueEmails) {
                      $tmpObj = [pscustomobject]@{EmailAddress='';Total=0;None=0;Spam=0;OPL=0;Gray=0}
                      $tmpObj.EmailAddress = $email
                      $total = 0
      
                      $dispositionResults | ? { $_.EmailAddress -eq $email } | % {
      
                          $total += $_.Count
                          $tmpObj.($_.Disposition) = $_.Count
                          Write-Output $tmpObj
                      }
      
                      
                  }
      
  • #62355
    Profile photo of Hassan Serajian
    Hassan Serajian
    Participant

    I came up with a template recently that I have been using which I thought it might be helpful. It is a simple way of merging data to single CSV with PowerShell and you might already achieved your goal but
    considering the CSV has content with these columns, 'Emailaddress', 'Total', 'None', 'Spam', 'OPL', and'Gray' then the following template can be updated with those column names to be used to merge data.
    How do you insert these data, 'None', 'Spam', 'OPL', and'Gray' into CSV?

    $OriginalCsvFileContent = Import-Csv .\CsvFileName.csv

    For($n=0;$n -le $OriginalCsvFileContent.Count;$n++) {
    $EmailUser = $OriginalCsvFileContent[$n].EmailUser | Out-File EmailUser.txt -Append
    $EmailDomain = $OriginalCsvFileContent.EmailDomain | Out-File EmailDomain.txt -Append
    $EmailAddress = $OriginalCsvFileContent.EmailAddress | Out-File EmailAddress.txt -Append
    $Disposition = $OriginalCsvFileContent.Disposition | Out-File Disposition.txt -Append
    $EmailAddressCount = (Get-Content .\EmailAddress.txt).count
    $result = -join($OriginalCsvFileContent[$n].EmailAddress,$Disposition,"CanAddMoreHere")
    }

    Import-Csv .\EmailUser.txt -Header EmailUser -Delimiter "," | Export-Csv -NoTypeInformation EmailUser.csv
    Import-Csv .\EmailDomain.txt -Header EmailDomain -Delimiter "," | Export-Csv -NoTypeInformation EmailDomain.csv
    Import-Csv .\EmailAddress.txt -Header EmailAddress -Delimiter "," | Export-Csv -NoTypeInformation EmailAddress.csv
    Import-Csv .\Disposition.txt -Header Disposition -Delimiter "," | Export-Csv -NoTypeInformation Disposition.csv

    #From this point forward column exporting can be selective as desired

    $CSV1= Get-Content .\EmailUser.csv
    $CSV2= Get-Content .\EmailDomain.csv
    $CSV3= Get-Content .\EmailAddress
    $CSV4= New-Object System.Collections.ArrayList
    $CSV5= Get-Content .\EmailUser.csv

    for($n=0;$n -lt $CSV1.count;$n++) {
    $CSV1[$n] + $CSV2[$n] + $CSV3[$n] + $CSV5[$n] | Out-File merged.csv -Append
    }

    Get-Content merged.csv

    #in the script each of these variables are doing the same thing; $Tgray,$Topl,$Tspam,$Tnone,$Total..

You must be logged in to reply to this topic.