Welcome Forums General PowerShell Q&A Powershell – Merging csv files, depending on filename.

Viewing 5 reply threads
  • Author
    Posts
    • #171673
      Participant
      Topics: 3
      Replies: 9
      Points: 59
      Rank: Member

      Hi

      I have tried to find a solution to my problem, but can´t find the solution I am looking for, so hopefully, someone in here can help me.

      I have millions of .csv files that I need to merge by a wildcards match on the file name, the merged .csv file needs to be divided by some of the filenames and only contain one header.

      Let´s say I have the following files.

      0006041651_00017771_20190820.csv
      0006041651_00017771_20190819.csv
      0006041651_00017771_20190818.csv
      0006041651_00017771_20190817.csv
      0006041651_00017771_20190816.csv

      0006041651_00013333_20190820.csv
      0006041651_00013333_20190819.csv
      0006041651_00013333_20190818.csv
      0006041651_00013333_20190817.csv
      0006041651_00013333_20190816.csv

      What I need. Is something that can match some of the filename *00017771* merge it with all the other files in the folder that contains *00017771* and only keep one header for all the merged data.
      When done with *00017771* it should move on to the next *00013333* and so on.

      I think I need to set something that knows what I am looking for in the file name, is the second _ in the filename, and ends with the third _

      xxxxxxxxxxx_00017771_xxxxxxxx.csv

      Hope this is not too confusing 🙂

       

    • #171706
      Senior Moderator
      Topics: 8
      Replies: 1158
      Points: 4,045
      Helping Hand
      Rank: Community Hero

      You can have a grouped object using calculated properties and Group-Object cmdlet then merge with the help of ForEach-Object cmdlet.

      $FileGroup = Get-ChildItem -Path <csv files located folder> | Select-Object -Property FullName,@{E={($_ -split '_')[1]};L='Pattern'} | Group-Object -Property Pattern
      
      $FileGroup | ForEach-Object -Process {Import-Csv -Path $_.Group.FullName | Export-Csv -Path <NewMergedCsvPath> -NoTypeInformation}
      
    • #171715
      Participant
      Topics: 9
      Replies: 430
      Points: 730
      Helping Hand
      Rank: Major Contributor
      #Requires -Version 5
      #Requires -Modules AZSBTools
      
      # Merge CSV files based on common 2nd block in file name
      
      
      # Read metadata/file names
      $FolderPath = 'C:\scripts'
      $FileList = Get-ChildItem -Path $FolderPath -File -Filter *.CSV
      Write-Log 'Identified',$FileList.Count,'CSV files in folder',$FolderPath Green,Cyan,Green,Cyan
      
      # Merge files
      $SecondBlockList = $FileList.Name | foreach { $_.Split('_')[1] } | select -Unique
      Write-Log 'Identified',$SecondBlockList.Count,'CSV file groups' Green,Cyan,Green
      
      foreach ($FileGroup in $SecondBlockList) {
          foreach ($FileName in ($FileList.FullName -match $FileGroup)) {
              Import-Csv $FileName | Export-Csv "$FileGroup.CSV" -Append
          }
      }
      
      Write-Log 'done' Darkyellow
      
      • #171904
        Participant
        Topics: 3
        Replies: 9
        Points: 59
        Rank: Member

        Hi Sam

        Thank you for the help

        I have version 5, and module AZSBTools, but get the following error for each file.

        Export-Csv : Cannot append CSV content to the following file: 50748463.CSV. The appended object does not have a property that corresponds to the following column: 0006041651;50748463;2019-06-27 03:00:00;00;5074846
        3;2019-06-27 00:13:00;263. To continue with mismatched properties, add the -Force parameter, and then retry the command.
        At C:\VK_Powershell\combine_csv_files.ps1:18 char:32
        + Import-Csv $FileName | Export-Csv "$FileGroup.CSV" -Append
        + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo : InvalidData: (0006041651;5074...27 00:13:00;263:String) [Export-Csv], InvalidOperationException
        + FullyQualifiedErrorId : CannotAppendCsvWithMismatchedPropertyNames,Microsoft.PowerShell.Commands.ExportCsvCommand

        I have added the -force parameter at the location suggested. -> At C:\VK_Powershell\combine_csv_files.ps1:18 char:32

        foreach ($FileGroup in $SecondBlockList) {
        foreach ($FileName in ($FileList.FullName -match $FileGroup)) {
        Import-Csv $FileName | Export-Csv "$FileGroup.CSV" -Force -Append
        }
        }

        It now works as intended.

        PS C:\Windows\system32> C:\VK_Powershell\combine_csv_files.ps1
        Identified 173 CSV files in folder \\inputfolder  
        Identified 16 CSV file groups 
        done

        But I am unable to find the merged files anywhere?

    • #171955
      Participant
      Topics: 9
      Replies: 430
      Points: 730
      Helping Hand
      Rank: Major Contributor

      current folder C:\Windows\system32

      • #171961
        Participant
        Topics: 3
        Replies: 9
        Points: 59
        Rank: Member

        Hi Sam

        Thanks once again.

        Is there any way to determine where the output should save the merged files after processing all of them

        Also it looks like the header of the orginal file has not been preserved, instead I have a new header, and added “” to all the ,

        , is now “,”

        #TYPE System.Management.Automation.PSCustomObject
        "0006041651;50748469;2019-08-15 03:15:00;00;50748469;2019-08-15 00:07:00;155","09;2019-12-31 00:00:00;132","63;2018-12-31 00:00:00;99","77;2017-12-31 00:00:00;1511;2015-02-10 00:00:00;0;14"
        "0006041651;50748469;2019-08-15 03:30:00;00;50748469;2019-08-15 00:22:00;155","09;2019-12-31 00:00:00;132","63;2018-12-31 00:00:00;99","77;2017-12-31 00:00:00;1511;2015-02-10 00:00:00;0;14"

        Original

        #serial-number;device-identification;created;value-data-count;fabrication-no,,inst-value,0,0,0;datetime,,inst-value,0,0,0;volume,m3,inst-value,0,0,0;date,,inst-value,0,0,0;volume,m3,inst-value,0,0,1;date,,inst-value,0,0,1;volume,m3,inst-value,0,0,2;date,,inst-value,0,0,2;on-time,day(s),inst-value,0,0,0;date,,inst-value,0,0,5;error-flags-dev-spec,,inst-value,0,0,0;fw-version,,inst-value,0,0,0
        0006041651;50748469;2019-08-18 03:15:00;00;50748469;2019-08-18 00:07:00;155,44;2019-12-31 00:00:00;132,63;2018-12-31 00:00:00;99,77;2017-12-31 00:00:00;1514;2015-02-10 00:00:00;0;14

         

         

    • #171970
      Participant
      Topics: 9
      Replies: 430
      Points: 730
      Helping Hand
      Rank: Major Contributor
      #Requires -Version 5
      #Requires -Modules AZSBTools
      
      # Merge CSV files based on common 2nd block in file name
      
      
      # Read metadata/file names
      $FolderPath = 'C:\scripts'
      $FileList = Get-ChildItem -Path $FolderPath -File -Filter *.CSV
      Write-Log 'Identified',$FileList.Count,'CSV files in folder',$FolderPath Green,Cyan,Green,Cyan
      
      # Merge files
      $SecondBlockList = $FileList.Name | foreach { $_.Split('_')[1] } | select -Unique
      Write-Log 'Identified',$SecondBlockList.Count,'CSV file groups' Green,Cyan,Green
      
      foreach ($FileGroup in $SecondBlockList) {
          foreach ($FileName in ($FileList.FullName -match $FileGroup)) {
              Import-Csv $FileName | Export-Csv "$FolderPath\$FileGroup.CSV" -Append -NoTypeInformation
          }
      }
      
      Write-Log 'done' Darkyellow
      

      Updated line 18 puts the output files in the same folder as the input files line 8 and removes ‘type information line 1 in output files.

      • #172186
        Participant
        Topics: 3
        Replies: 9
        Points: 59
        Rank: Member

        Thx once again, I now have to option to redirect them.

        Do you know how to get it to stop adding “” to my commas?

        , becomes “,”

        00:07:00;155“,”09;2019-12-31 00:00:00;132“,”63;2018-12-31 00:00:00;99“,”77;

        "0006041651;50748469;2019-08-15 03:15:00;00;50748469;2019-08-15 00:07:00;155","09;2019-12-31 00:00:00;132","63;2018-12-31 00:00:00;99","77;2017-12-31 00:00:00;1511;2015-02-10 00:00:00;0;14"
        "0006041651;50748469;2019-08-15 03:30:00;00;50748469;2019-08-15 00:22:00;155","09;2019-12-31 00:00:00;132","63;2018-12-31 00:00:00;99","77;2017-12-31 00:00:00;1511;2015-02-10 00:00:00;0;14"
    • #172198
      Participant
      Topics: 9
      Replies: 430
      Points: 730
      Helping Hand
      Rank: Major Contributor

      Look into the -Delimiter parameter of the Export-Csv cmdlet

Viewing 5 reply threads
  • The topic ‘Powershell – Merging csv files, depending on filename.’ is closed to new replies.