import-csv same header twice

This topic contains 10 replies, has 4 voices, and was last updated by  Andreas Lorensen 8 months, 3 weeks ago.

  • Author
    Posts
  • #67404

    Andreas Lorensen
    Participant

    I have a problem....I have some csv files, and sometime these by error has same headerĀ“field twice, and that makes import-csv not so happy.

    So how do I get around that? I never knows if or which header could be more than once in the csv files. It would be fine, if next ocurance of a header just got appended a number or something.

  • #67407

    Sam Boutros
    Participant

    'Same header field twice', meaning the entire header row duplicated? , or meaning more than one column have the same label?
    If it's the former, I would use Get-Content, compare .[0] and .[1] lines, if they're the same, over-write the source CSV using lines .[1]..[last]
    If it's the latter, I would use Get-Content to read the first line, use .split(',') to break it down into an array of column labels, and check for and repair duplicates if any, write updated data back to CSV..

  • #67410

    Andreas Lorensen
    Participant

    It is "more than one column have the same label"

    Sound like the way to go with Get-Content. Can you give an example how to use that and fix duplicate fields ?

  • #67414

    Andreas Lorensen
    Participant

    OK I am on the right way...

    My test.csv is like this:
    "Header1","Header2","Header1"
    "Nummer1","Nummer2","Nummer3"

    $headers = (Get-Content .\test.csv)[0].Split(',')

    Then I have a aray – how to compare each part whith the rest, and if a duplicate add number to the duplicate ?

    Something like
    foreach ($header in $headers) {
    }

    But how to compare and add?

  • #67417

    Rob Simmers
    Participant

    Try this:

    #test.csv
    #Header1,Header2,Header3
    #Header1,Header2,Header3
    #Data1,Data2,Data3
    
    
    $params = @{
        Path = "C:\Users\Rob\Desktop\test.csv"
    }
    
    $test = Get-Content @params
    
    if ($test[0] = $test[1]) {
        $csv = Import-CSV @params | Select * -Skip 1
    }
    else {
        $csv = Import-CSV @params
    }
    
    
    $csv
    
    • #67423

      Andreas Lorensen
      Participant

      thanks – but my "problem" is that there more than one column have the same label.
      My test.csv looks like this – see that Header1 is duplicate
      "Header1","Header2","Header1"
      "Nummer1","Nummer2","Nummer3"

      I figured out how to read firstline, and get that into a aray, but how to compare each part if it is a duplicate, and change it and write it back to the test.cvs – that is my next problem...

  • #67429

    Andreas Lorensen
    Participant

    Well – something like this worsk, but what to do when duplicate is found...

    $headers = (Get-Content -Path .\test.csv)[0].Split(',')
    
    $newheaders = ''
    foreach ( $header in $headers ) 
    { 
      IF ( $newheaders.Contains( $header ) -and $newheaders[$header] -lt 2 ) 
      {
        #I found a duplicate - what to do  
      }
      $newheaders += $header
    }
  • #67434

    Sam Boutros
    Participant
    $headers = (Get-Content .\test1.csv)[0].Split(',') | sort
    
    1..$($headers.Count-1) | % { # Skipping .[0] since it has no prior to compare to..
        if ($headers[$_] -match $headers[$_-1]) { 
            # Found duplicate, increment last character
            $headers[$_] = $headers[$_].Replace($headers[$_][-1],[int]($headers[$_][-1])+1)
        }
    }
    
    "Old headers: $(((Get-Content .\test1.csv)[0].Split(',') | sort) -join ', ')"
    "New headers: $(($headers | sort ) -join ', ')"
    
  • #67435

    Andreas Lorensen
    Participant

    Ohhhh great – I am learning

    Now I got this – It works, but dosnt care if there is more than one duplicate šŸ™

    $headers = (Get-Content -Path .\test.csv)[0].Split(',')
    
    #[array]$newheaders = ''
    
    foreach ( $header in $headers ) 
    { 
      IF ( $newheaders.Contains($header)) 
      {
        $newheaders += ($header.insert(($header.Length -1),"_duplicate"))
      }
      ELSE 
      {
        $newheaders += $header
      }
    }
    
    $newheaders
    
  • #67437

    Curtis Smith
    Participant

    I think the problem you are going to run into the examples so far is that it does not account for the data under the columns, it just gets the headers and unduplicates then and in the process changes the order of the columns so they no longer match their data

    I would take a different approach. I would just add a column number to the end of each column to ensure it is unique. Without knowing the rest of your process, I do not know if this will introduce new challenges, but it seemed like maintaining the column header name exactly was not of the utmost importance.

    input.csv

    "Hea,der1",Header2,"Hea,der1"
    "Nummer1","Nummer2","Nummer3"

    Script

    $content = Get-Content .\input.csv
    $headerline = $content[0]
    $headers = (($headerline |
        Select-String "`"[^`"\r\n]*`"|'[^'\r\n]*'|[^,\r\n]*" -AllMatches).Matches |
        Where-Object {$_.Value}).Value.trim('"') |
        ForEach-Object {
            $i++
            "$($_)_$i"
        }
    $content[0] = "`"$($headers -join '","')`""
    $content | Set-Content .\input.csv
    
    Remove-Variable "i"

    Results

    "Hea,der1_1","Header2_2","Hea,der1_3"
    "Nummer1","Nummer2","Nummer3"
    

You must be logged in to reply to this topic.