Powershell - Merging csv files, depending on filename.

Welcome Forums General PowerShell Q&A Powershell - Merging csv files, depending on filename.

This topic contains 8 replies, has 3 voices, and was last updated by

 
Participant
1 month ago.

  • Author
    Posts
  • #171673

    Participant
    Topics: 3
    Replies: 9
    Points: 59
    Rank: Member

    Hi

    I have tried to find a solution to my problem, but can´t find the solution I am looking for, so hopefully, someone in here can help me.

    I have millions of .csv files that I need to merge by a wildcards match on the file name, the merged .csv file needs to be divided by some of the filenames and only contain one header.

    Let´s say I have the following files.

    0006041651_00017771_20190820.csv
    0006041651_00017771_20190819.csv
    0006041651_00017771_20190818.csv
    0006041651_00017771_20190817.csv
    0006041651_00017771_20190816.csv

    0006041651_00013333_20190820.csv
    0006041651_00013333_20190819.csv
    0006041651_00013333_20190818.csv
    0006041651_00013333_20190817.csv
    0006041651_00013333_20190816.csv

    What I need. Is something that can match some of the filename *00017771* merge it with all the other files in the folder that contains *00017771* and only keep one header for all the merged data.
    When done with *00017771* it should move on to the next *00013333* and so on.

    I think I need to set something that knows what I am looking for in the file name, is the second _ in the filename, and ends with the third _

    xxxxxxxxxxx_00017771_xxxxxxxx.csv

    Hope this is not too confusing 🙂

     

  • #171706

    Senior Moderator
    Topics: 8
    Replies: 1041
    Points: 3,439
    Helping Hand
    Rank: Community Hero

    You can have a grouped object using calculated properties and Group-Object cmdlet then merge with the help of ForEach-Object cmdlet.

    $FileGroup = Get-ChildItem -Path  | Select-Object -Property FullName,@{E={($_ -split '_')[1]};L='Pattern'} | Group-Object -Property Pattern
    
    $FileGroup | ForEach-Object -Process {Import-Csv -Path $_.Group.FullName | Export-Csv -Path  -NoTypeInformation}
    
  • #171715

    Participant
    Topics: 9
    Replies: 423
    Points: 676
    Helping Hand
    Rank: Major Contributor
    #Requires -Version 5
    #Requires -Modules AZSBTools
    
    # Merge CSV files based on common 2nd block in file name
    
    
    # Read metadata/file names
    $FolderPath = 'C:\scripts'
    $FileList = Get-ChildItem -Path $FolderPath -File -Filter *.CSV
    Write-Log 'Identified',$FileList.Count,'CSV files in folder',$FolderPath Green,Cyan,Green,Cyan
    
    # Merge files
    $SecondBlockList = $FileList.Name | foreach { $_.Split('_')[1] } | select -Unique
    Write-Log 'Identified',$SecondBlockList.Count,'CSV file groups' Green,Cyan,Green
    
    foreach ($FileGroup in $SecondBlockList) {
        foreach ($FileName in ($FileList.FullName -match $FileGroup)) {
            Import-Csv $FileName | Export-Csv "$FileGroup.CSV" -Append
        }
    }
    
    Write-Log 'done' Darkyellow
    
    • #171904

      Participant
      Topics: 3
      Replies: 9
      Points: 59
      Rank: Member

      Hi Sam

      Thank you for the help

      I have version 5, and module AZSBTools, but get the following error for each file.

      Export-Csv : Cannot append CSV content to the following file: 50748463.CSV. The appended object does not have a property that corresponds to the following column: 0006041651;50748463;2019-06-27 03:00:00;00;5074846
      3;2019-06-27 00:13:00;263. To continue with mismatched properties, add the -Force parameter, and then retry the command.
      At C:\VK_Powershell\combine_csv_files.ps1:18 char:32
      + Import-Csv $FileName | Export-Csv "$FileGroup.CSV" -Append
      + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      + CategoryInfo : InvalidData: (0006041651;5074...27 00:13:00;263:String) [Export-Csv], InvalidOperationException
      + FullyQualifiedErrorId : CannotAppendCsvWithMismatchedPropertyNames,Microsoft.PowerShell.Commands.ExportCsvCommand

      I have added the -force parameter at the location suggested. -> At C:\VK_Powershell\combine_csv_files.ps1:18 char:32

      foreach ($FileGroup in $SecondBlockList) {
      foreach ($FileName in ($FileList.FullName -match $FileGroup)) {
      Import-Csv $FileName | Export-Csv "$FileGroup.CSV" -Force -Append
      }
      }

      It now works as intended.

      PS C:\Windows\system32> C:\VK_Powershell\combine_csv_files.ps1
      Identified 173 CSV files in folder \\inputfolder  
      Identified 16 CSV file groups 
      done

      But I am unable to find the merged files anywhere?

  • #171955

    Participant
    Topics: 9
    Replies: 423
    Points: 676
    Helping Hand
    Rank: Major Contributor

    current folder C:\Windows\system32

    • #171961

      Participant
      Topics: 3
      Replies: 9
      Points: 59
      Rank: Member

      Hi Sam

      Thanks once again.

      Is there any way to determine where the output should save the merged files after processing all of them

      Also it looks like the header of the orginal file has not been preserved, instead I have a new header, and added "" to all the ,

      , is now ","

      #TYPE System.Management.Automation.PSCustomObject
      "0006041651;50748469;2019-08-15 03:15:00;00;50748469;2019-08-15 00:07:00;155","09;2019-12-31 00:00:00;132","63;2018-12-31 00:00:00;99","77;2017-12-31 00:00:00;1511;2015-02-10 00:00:00;0;14"
      "0006041651;50748469;2019-08-15 03:30:00;00;50748469;2019-08-15 00:22:00;155","09;2019-12-31 00:00:00;132","63;2018-12-31 00:00:00;99","77;2017-12-31 00:00:00;1511;2015-02-10 00:00:00;0;14"

      Original

      #serial-number;device-identification;created;value-data-count;fabrication-no,,inst-value,0,0,0;datetime,,inst-value,0,0,0;volume,m3,inst-value,0,0,0;date,,inst-value,0,0,0;volume,m3,inst-value,0,0,1;date,,inst-value,0,0,1;volume,m3,inst-value,0,0,2;date,,inst-value,0,0,2;on-time,day(s),inst-value,0,0,0;date,,inst-value,0,0,5;error-flags-dev-spec,,inst-value,0,0,0;fw-version,,inst-value,0,0,0
      0006041651;50748469;2019-08-18 03:15:00;00;50748469;2019-08-18 00:07:00;155,44;2019-12-31 00:00:00;132,63;2018-12-31 00:00:00;99,77;2017-12-31 00:00:00;1514;2015-02-10 00:00:00;0;14

       

       

  • #171970

    Participant
    Topics: 9
    Replies: 423
    Points: 676
    Helping Hand
    Rank: Major Contributor
    #Requires -Version 5
    #Requires -Modules AZSBTools
    
    # Merge CSV files based on common 2nd block in file name
    
    
    # Read metadata/file names
    $FolderPath = 'C:\scripts'
    $FileList = Get-ChildItem -Path $FolderPath -File -Filter *.CSV
    Write-Log 'Identified',$FileList.Count,'CSV files in folder',$FolderPath Green,Cyan,Green,Cyan
    
    # Merge files
    $SecondBlockList = $FileList.Name | foreach { $_.Split('_')[1] } | select -Unique
    Write-Log 'Identified',$SecondBlockList.Count,'CSV file groups' Green,Cyan,Green
    
    foreach ($FileGroup in $SecondBlockList) {
        foreach ($FileName in ($FileList.FullName -match $FileGroup)) {
            Import-Csv $FileName | Export-Csv "$FolderPath\$FileGroup.CSV" -Append -NoTypeInformation
        }
    }
    
    Write-Log 'done' Darkyellow
    

    Updated line 18 puts the output files in the same folder as the input files line 8 and removes 'type information line 1 in output files.

    • #172186

      Participant
      Topics: 3
      Replies: 9
      Points: 59
      Rank: Member

      Thx once again, I now have to option to redirect them.

      Do you know how to get it to stop adding "" to my commas?

      , becomes ","

      00:07:00;155","09;2019-12-31 00:00:00;132","63;2018-12-31 00:00:00;99","77;

      "0006041651;50748469;2019-08-15 03:15:00;00;50748469;2019-08-15 00:07:00;155","09;2019-12-31 00:00:00;132","63;2018-12-31 00:00:00;99","77;2017-12-31 00:00:00;1511;2015-02-10 00:00:00;0;14"
      "0006041651;50748469;2019-08-15 03:30:00;00;50748469;2019-08-15 00:22:00;155","09;2019-12-31 00:00:00;132","63;2018-12-31 00:00:00;99","77;2017-12-31 00:00:00;1511;2015-02-10 00:00:00;0;14"
  • #172198

    Participant
    Topics: 9
    Replies: 423
    Points: 676
    Helping Hand
    Rank: Major Contributor

    Look into the -Delimiter parameter of the Export-Csv cmdlet

You must be logged in to reply to this topic.