Reading CSV with inconsistent formatting

Welcome Forums General PowerShell Q&A Reading CSV with inconsistent formatting

This topic contains 4 replies, has 3 voices, and was last updated by

 
Participant
1 month ago.

  • Author
    Posts
  • #171490

    Participant
    Topics: 13
    Replies: 42
    Points: 243
    Rank: Participant

    I have a huge CSV file that's 300MB, 250k rows. Reading it plaintext, the header row reads as follows:

    IP,"DNS Host Name",Application,Version,Source,"Date First Found","Date Last Updated"

    As you can see, there is a quotation around values that are more than one word.

    I think this is causing a problem when reading it in. It always gives me the warning "One or more headers were not specified". When I output what it read, it gives me just the IP addresses. When I specify the header I want with "-header", the same thing occurs.

    Another weird thing is that the first 5 lines are "Filters:", "Asset Group:", "IP Address or NetBlock:", "Application:", and "IP", without the quotes. I don't see "Filters:", "Asset Group:", or "IP Address or NetBlock:" anywhere in the CSV though. I have no clue where it's getting them from.

    I'm reading the file using:

    Import-CSV "FileName.csv"

    I also tried specifying the delimiter as "," just to make sure nothing funky was going on.

  • #171505

    Participant
    Topics: 9
    Replies: 423
    Points: 676
    Helping Hand
    Rank: Major Contributor

    Please provide the first 5 lines of the csv file as in

    Get-Content .\CSV1.csv  | select -First 5 
    
  • #171508

    Participant
    Topics: 13
    Replies: 42
    Points: 243
    Rank: Participant

    Please provide the first 5 lines of the csv file as in

    Get-Content .\CSV1.csv | select First 5
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

    It says this but I swear that's not what I see when I open it in Visual Studio Code or Wordpad. It would appear that the first line is empty, but I can't see how that's possible.

    Filters:
    "Asset Group:",N/A
    "IP Address or NetBlock:",N/A
    Application:,N/A
  • #171496

    Participant
    Topics: 0
    Replies: 2
    Points: 17
    Rank: Member

    You can try something like this to use your own headers.

    $yourCSVvariable = get-content $InFile |select -skip 1 |convertFrom-csv -Header IP,DNSName,Application,Version,Source,FirstFound,LastUpdated

  • #171517

    Participant
    Topics: 13
    Replies: 42
    Points: 243
    Rank: Participant

    You can try something like this to use your own headers.

    $yourCSVvariable = get-content $InFile |select -skip 1 |convertFrom-csv -Header IP,DNSName,Application,Version,Source,FirstFound,LastUpdated

    That works a lot better!

You must be logged in to reply to this topic.