huuuge CSV needs editing during import


This topic contains 3 replies, has 4 voices, and was last updated 1 year ago.

  • #82630

    Participant
    Points: 0
    Rank: Member

    I have a CSV that needs some formatting during import.

    The first 4 lines need to be removed completely.
    The 5th line includes the header.

    I've tried the Import-Csv cmdlet, but it imports everything and I haven't found a way to skip those lines.
    Using Get-Content (gc) takes a very long time.
    I've used Get-Content "c:/temp/filename.csv" | Select-Object -Skip 4 | ConvertFrom-Csv
    This also took a very long time and froze my machine.

    Adam Bertram suggested adding the -Raw switch to Get-Content, but it still takes a very long time and freezes my box.

  • #82639

    Participant
    Points: 0
    Rank: Member

    You could use the -ReadCount parameter of Get-Content and set it to 0 or maybe 1000. You could also use the .NET StreamReader and write the output on the fly as you go through the input file, so it isn't trying to store it all in RAM. Here is an example:

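    # Note: .NET resolves a relative path against the process working directory,
    # which may differ from the current PowerShell location, so a full path is safer
    $Hugefile = New-Object System.IO.StreamReader -ArgumentList "MyHuuuugeFile.txt"
    # Compare with $null so a blank line (an empty string, which is falsy) does not end the loop early
    while ($null -ne ($Line = $Hugefile.ReadLine()))
    {
        # example to do something with the line before saving
        $Line = $Line -replace "aaaaa", "bbb"

        # save line to new file
        $Line | Out-File 'MyNewHuuuuggeFile.txt' -Append
    }
    $Hugefile.Close()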
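
    Applied to the original question (drop the first 4 lines so the header on line 5 comes first), the same streaming idea could look roughly like the sketch below. The function name Copy-FileSkippingLines and its parameters are made up for illustration and are not from this thread. It writes through a StreamWriter rather than Out-File -Append, on the assumption that reopening the output file for every line is too slow for a file this size. One caveat: StreamWriter writes UTF-8 without a BOM by default, which may differ from the encoding of the source file.

```powershell
# Hypothetical helper (not from the thread): copy a file while dropping the
# first N lines, holding only one line in memory at a time.
function Copy-FileSkippingLines {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $Destination,
        [int] $Skip = 4
    )

    # .NET resolves relative paths against the process working directory,
    # so turn both paths into full paths based on the PowerShell location first.
    $source = (Resolve-Path -LiteralPath $Path).ProviderPath
    $target = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Destination)

    $reader = New-Object System.IO.StreamReader -ArgumentList $source
    try {
        $writer = New-Object System.IO.StreamWriter -ArgumentList $target
        try {
            $lineNumber = 0
            # Compare with $null so blank lines do not end the loop early
            while ($null -ne ($line = $reader.ReadLine())) {
                $lineNumber++
                if ($lineNumber -gt $Skip) {
                    $writer.WriteLine($line)
                }
            }
        }
        finally { $writer.Close() }
    }
    finally { $reader.Close() }
}
```

    After calling it with -Path c:/temp/filename.csv, a -Destination of your choosing, and -Skip 4, the trimmed file starts with the header line and can be handed to Import-Csv. Piping Import-Csv straight into ForEach-Object, instead of assigning the result to a variable, should keep the rows from all sitting in memory at once.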
    
  • #82642
    Jon

    Participant
    Points: 25
    Rank: Member

    Have you tried using import-csv? How large is the file?

  • #82648

    Participant
    Points: 4
    Rank: Member

    Chrissy LeMaire has blogged on techniques for working with large CSV files. You might find this article has useful pointers:

    Quickly Find Duplicates in Large CSV Files using PowerShell
