Huuuge CSV needs editing during import

This topic contains 3 replies, has 4 voices, and was last updated by  Matt Bloomfield 7 months ago.

  • #82630

    Last Name

    I have a CSV that needs some formatting during import.

    The first 4 lines need to be removed completely.
    The 5th line includes the header.

    I've tried the Import-Csv cmdlet, but it imports everything and I haven't found a way to remove those lines.
    Using Get-Content (gc) takes a very long time.
    I've also tried Get-Content "c:/temp/filename.csv" | Select-Object -Skip 4 | ConvertFrom-Csv
    This also took a very long time and froze my machine.

    Adam Bertram suggested adding the -Raw switch to Get-Content, but it still takes a very long time and freezes my box.

  • #82639


    You could use the -ReadCount parameter of Get-Content and set it to 0 (all lines sent down the pipeline as one array) or maybe 1000 (batches of 1000 lines). You could also use a .NET StreamReader and write the output on the fly as you go through the input file, so it isn't trying to store it all in RAM. Here is an example:

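    # .NET resolves relative paths against the process directory, so a full path is safer
    $Hugefile = New-Object System.IO.StreamReader -ArgumentList "MyHuuuugeFile.txt"
    # Compare against $null so that blank lines do not end the loop early
    while ($null -ne ($Line = $Hugefile.ReadLine())) {
        # example to do something with the line before saving
        $Line = $Line -replace "aaaaa", "bbb"
        # save line to new file
        $Line | Out-File 'MyNewHuuuuggeFile.txt' -Append
    }
    $Hugefile.Close()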
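
Applied to the original question (drop the first 4 lines, keep the header on line 5), the same streaming idea could look roughly like the sketch below. The function name Remove-LeadingLines and its parameters are made up for illustration, and it assumes the file is UTF-8 or carries a byte-order mark. It pairs the StreamReader with a StreamWriter so the output file is opened once rather than once per line.

```powershell
# Hypothetical helper (not a built-in cmdlet): copy a text file to a new
# path, dropping the first $Skip lines and streaming the rest.
function Remove-LeadingLines {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $Destination,
        [int] $Skip = 4
    )
    # .NET resolves relative paths against the process directory, not the
    # current PowerShell location, so convert both to full paths first.
    $source = (Resolve-Path -LiteralPath $Path).ProviderPath
    $target = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Destination)

    $reader = New-Object System.IO.StreamReader -ArgumentList $source
    $writer = New-Object System.IO.StreamWriter -ArgumentList $target
    try {
        # Read and discard the unwanted leading lines (stops early at end of file).
        for ($i = 0; ($i -lt $Skip) -and ($null -ne $reader.ReadLine()); $i++) { }
        # Copy everything else one line at a time; blank lines are preserved.
        while ($null -ne ($line = $reader.ReadLine())) {
            $writer.WriteLine($line)
        }
    }
    finally {
        # Always release both file handles, even if something above throws.
        $writer.Dispose()
        $reader.Dispose()
    }
}
```

Something like Remove-LeadingLines -Path 'c:\temp\filename.csv' -Destination 'c:\temp\trimmed.csv' would then leave a file whose first line is the header, which Import-Csv can read directly. Piping the Import-Csv output straight into the next command, rather than collecting it in a variable, should keep memory use down because rows are passed along as they are read.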
  • #82642


    Have you tried using Import-Csv? How large is the file?

  • #82648

    Matt Bloomfield

    Chrissy LeMaire has blogged on techniques for working with large CSV files. You might find this article has useful pointers:
