Author Posts

March 27, 2017 at 6:08 pm

I have a problem....I have some csv files, and sometime these by error has same header´field twice, and that makes import-csv not so happy.

So how do I get around that? I never knows if or which header could be more than once in the csv files. It would be fine, if next ocurance of a header just got appended a number or something.

March 27, 2017 at 6:20 pm

'Same header field twice', meaning the entire header row duplicated? , or meaning more than one column have the same label?
If it's the former, I would use Get-Content, compare .[0] and .[1] lines, if they're the same, over-write the source CSV using lines .[1]..[last]
If it's the latter, I would use Get-Content to read the first line, use .split(',') to break it down into an array of column labels, and check for and repair duplicates if any, write updated data back to CSV..

March 27, 2017 at 6:29 pm

It is "more than one column have the same label"

Sound like the way to go with Get-Content. Can you give an example how to use that and fix duplicate fields ?

March 27, 2017 at 7:12 pm

OK I am on the right way...

My test.csv is like this:
"Header1","Header2","Header1"
"Nummer1","Nummer2","Nummer3"

$headers = (Get-Content .\test.csv)[0].Split(',')

Then I have a aray – how to compare each part whith the rest, and if a duplicate add number to the duplicate ?

Something like
foreach ($header in $headers) {
}

But how to compare and add?

March 27, 2017 at 7:25 pm

Try this:

#test.csv
#Header1,Header2,Header3
#Header1,Header2,Header3
#Data1,Data2,Data3


$params = @{
    Path = "C:\Users\Rob\Desktop\test.csv"
}

$test = Get-Content @params

if ($test[0] = $test[1]) {
    $csv = Import-CSV @params | Select * -Skip 1
}
else {
    $csv = Import-CSV @params
}


$csv

March 27, 2017 at 7:56 pm

thanks – but my "problem" is that there more than one column have the same label.
My test.csv looks like this – see that Header1 is duplicate
"Header1","Header2","Header1"
"Nummer1","Nummer2","Nummer3"

I figured out how to read firstline, and get that into a aray, but how to compare each part if it is a duplicate, and change it and write it back to the test.cvs – that is my next problem...

March 27, 2017 at 8:51 pm

Well – something like this worsk, but what to do when duplicate is found...

$headers = (Get-Content -Path .\test.csv)[0].Split(',')

$newheaders = ''
foreach ( $header in $headers ) 
{ 
  IF ( $newheaders.Contains( $header ) -and $newheaders[$header] -lt 2 ) 
  {
    #I found a duplicate - what to do  
  }
  $newheaders += $header
}

March 27, 2017 at 9:03 pm

$headers = (Get-Content .\test1.csv)[0].Split(',') | sort

1..$($headers.Count-1) | % { # Skipping .[0] since it has no prior to compare to..
    if ($headers[$_] -match $headers[$_-1]) { 
        # Found duplicate, increment last character
        $headers[$_] = $headers[$_].Replace($headers[$_][-1],[int]($headers[$_][-1])+1)
    }
}

"Old headers: $(((Get-Content .\test1.csv)[0].Split(',') | sort) -join ', ')"
"New headers: $(($headers | sort ) -join ', ')"

March 27, 2017 at 10:48 pm

Ohhhh great – I am learning

Now I got this – It works, but dosnt care if there is more than one duplicate 🙁

$headers = (Get-Content -Path .\test.csv)[0].Split(',')

#[array]$newheaders = ''

foreach ( $header in $headers ) 
{ 
  IF ( $newheaders.Contains($header)) 
  {
    $newheaders += ($header.insert(($header.Length -1),"_duplicate"))
  }
  ELSE 
  {
    $newheaders += $header
  }
}

$newheaders

March 28, 2017 at 5:03 am

I think the problem you are going to run into the examples so far is that it does not account for the data under the columns, it just gets the headers and unduplicates then and in the process changes the order of the columns so they no longer match their data

I would take a different approach. I would just add a column number to the end of each column to ensure it is unique. Without knowing the rest of your process, I do not know if this will introduce new challenges, but it seemed like maintaining the column header name exactly was not of the utmost importance.

input.csv

"Hea,der1",Header2,"Hea,der1"
"Nummer1","Nummer2","Nummer3"

Script

$content = Get-Content .\input.csv
$headerline = $content[0]
$headers = (($headerline |
    Select-String "`"[^`"\r\n]*`"|'[^'\r\n]*'|[^,\r\n]*" -AllMatches).Matches |
    Where-Object {$_.Value}).Value.trim('"') |
    ForEach-Object {
        $i++
        "$($_)_$i"
    }
$content[0] = "`"$($headers -join '","')`""
$content | Set-Content .\input.csv

Remove-Variable "i"

Results

"Hea,der1_1","Header2_2","Hea,der1_3"
"Nummer1","Nummer2","Nummer3"

March 28, 2017 at 6:18 am

That is a beatuful way to do it....