How to Match 2 CSV\Excel file data

This topic contains 10 replies, has 3 voices, and was last updated by  SG 4 months, 3 weeks ago.

  • Author
    Posts
  • #70982

    SG
    Participant

    I have data in 2 different CSV\excel format, WHERE I need to map the data between 2 files.
    If first 3 attributes matches (BUCode, PrdID, Line#), then consolidate into 3rd file, else exception in 4th file.
    See the illustration:
    File-1
    BUCode PrdID Line# Region Amount
    USNorth A001 A001-2 US 100
    USSouth A001 A001-3 US 200
    USNorth A003 A003-1 Can 500
    USNorth A004 A004-2 US 400
    USNorth A002 A002-2 US 200

    File-2
    BUCode PrdID Line# SaleRep
    USNorth A001 A001-2 Marc
    USSouth A001 A001-3 Sam
    USNorth A003 A003-1 Rony
    USEast A004 A004-2 Mac
    USNorth A002 A002-2 Marc

    FinalOutput-3
    BUCode PrdID Line# Region SaleRep Amount
    USNorth A001 A001-2 US Marc 100
    USSouth A001 A001-3 US Sam 200
    USNorth A003 A003-1 Can Rony 500
    USNorth A002 A002-2 US Marc 200

    FinalOutput-4
    USNorth A004 A004-2 US – 400

    Could someone help on this.

    Thanks
    Shishir

  • #71033

    Matt Bloomfield
    Participant

    This is a pretty straightforward problem but we'd like to see some of the code you're having problems with rather than write you a script from scratch. What code have you come up with so far?

  • #71077

    SG
    Participant

    #import the contents of both csv files
    $dbexcel=import-csv book21.csv
    $Lexcel=import-csv book22.csv

    #prepare the output csv and create the headers
    $outputexcel="Outputexcel.csv"
    $outputline= "BuCode" + "PrdID" + "Line" + "SalesRep" + "Amount"
    $outputline | out-file $outputexcel

    for ($i=0; $i -le $lexcel.Length-1;$i++)
    {
    # Assign the yes / null values to equal the word equivalent
    if ($dbexcel.isavail[$i] -eq $lexcel.isavail[$i] ) {"Available"} else {"Unavailable"}

    #create the live of csv content from the two input csv files
    $outputline= $dbexcel + " + " + $lexcel
    #output that line to the csv file
    $outputline | out-file $outputexcel -Append

    }

  • #71107

    Hi,
    Try this out

    $csv1 = Import-Csv -Delimiter ";" -Path C:\temp\file1.csv
    $csv2 = Import-Csv -Delimiter ";" -Path C:\temp\file2.csv
    
    $matched = @()
    $notMatched = @()
    
    foreach ($row in $csv1) {
        $match = $csv2 | Where-Object {
            $_.BUCode -eq $row.BUCode -and 
            $_.PrdID -eq $row.PrdID -and 
            $_.'Line#' -eq $row.'Line#' }
    
        if ([string]::IsNullOrEmpty($match)) {
    
            $notMatched += $row
        
        }
        else {
    
            $matched += $row | Select-Object BUCode,PrdID,'Line#',
                                             @{Label = 'SaleRep'; Expression = {$match.SaleRep}},
                                             Amount
        }
    }
    
    if ($matched.Count -ge 1) {
        $matched | Export-Csv -Path C:\temp\file3.csv -Delimiter ';' -NoTypeInformation
    }
    
    if ($notMatched.Count -ge 1) {
        $notMatched | Export-Csv -Path C:\temp\file4.csv -Delimiter ';' -NoTypeInformation
    }
    
  • #71108

    Matt Bloomfield
    Participant

    Your logic isn't quite right.

    For every row in the first file, you want to check if the first three columns match the data in the second file.
    If a match is found, you want to add the data from the second file to the row and export it to a new CSV file. You can do this using a custom object.
    If no match is found, you want to just export the current row to a new file.
    I think the code is fairly self explanatory with the cmdlet names but if you would like any clarification, please let me know:

    $csv1 = Import-Csv 1.csv
    $csv2 = Import-Csv 2.csv
    
    foreach ($r in $csv1) {
    
        $found = $false
    
        foreach ($s in $csv2) {
    
            if (($r.BUCode -eq $s.BUCode) -and ($r.PrdID -eq $s.PrdID) -and ($r.'line#' -eq $s.'line#')) {
                
                $found = $true
    
                $obj = [PSCustomObject] @{
                    BUCode  = $r.BUCode
                    PrdID   = $r.PrdID
                    'Line#' = $r.'Line#'
                    Region  = $r.Region
                    SaleRep = $s.SaleRep
                    Amount  = $r.Amount
                        
                } #end object creation
    
                $obj | Export-Csv matches.csv -Append -NoClobber -NoTypeInformation
    
            } #endif
    
        } #end foreach $s
    
        if (-not $found) {
    
            $obj = [PSCustomObject] @{
                BUCode  = $r.BUCode
                PrdID   = $r.PrdID
                'Line#' = $r.'Line#'
                Region  = $r.Region
                Amount  = $r.Amount
    
            } #end object creation
    
            $obj | Export-Csv noMatches.csv -Append -NoClobber -NoTypeInformation
        
        } #endif
    
    }#end foreach $r
    
  • #71110

    Matt Bloomfield
    Participant

    Nice approach, Aleksandras. I like that you did it without using a nested foreach statement although using Measure-Command the nested foreach seems slightly faster. Be interesting to see how it scales to big CSV files.

  • #71114

    Thanks Matt, I always try to avoid nested loops if possible, because it is more confusing later on, at least to me. Measured Where-Object and where() method. where() runs a bit faster

        Measure-Command {
        $match = $csv2.where({
            $_.BUCode -eq $row.BUCode -and 
            $_.PrdID -eq $row.PrdID -and 
            $_.'Line#' -eq $row.'Line#' })
        } | select Milliseconds, Ticks
    

    Milliseconds Ticks
    ———— —–
    7 77261
    0 1486
    0 1141
    0 1055
    0 1239

        Measure-Command {
        $match = $csv2 | where {
            $_.BUCode -eq $row.BUCode -and 
            $_.PrdID -eq $row.PrdID -and 
            $_.'Line#' -eq $row.'Line#' }
        } | select Milliseconds, Ticks
    

    Milliseconds Ticks
    ———— —–
    7 77015
    0 3604
    0 3000
    0 2914
    0 3366

  • #71152

    SG
    Participant

    Thanks Matt & Aleksandras...I will try both the approach and will share an update...Shishir.

  • #71294

    SG
    Participant

    The code works for me...thanks a lot.

    Also, I need to add the 2 new columns in the destination table based on the below calculation. I tried this code,
    but did not work:
    {
    $matched += $row | Select-Object BUCode,PrdID, 'Line#',
    @{Label = 'SaleRep'; Expression = {$match.SaleRep}},
    "Daily Gross Amount"
    , @{l = 'Daily Net Amount' } ; e= {("Daily Gross Amount" – ("Daily Gross Amount" * 0.10))}}
    ## "The Net Amount is less 10% of the Gross Amount"

    , @{l = 'Discount Amount' } ; e= {(if("Daily Net Amount" > 2000) {"Daily Net Amount" * 0.05},
    if("Daily Net Amount" > 5000) {"Daily Net Amount" * 0.10} else {0}) }}
    }

    Could you please help in to correct, where I'm wrong.

    Shishir

  • #71516

    SG
    Participant

    Could you please help in to correct, where I'm wrong

    • #71530

      SG
      Participant

      Thanks, I am able to fix the issues.

You must be logged in to reply to this topic.