huuuge CSV needs editing during import

This topic contains 3 replies, has 4 voices, and was last updated by Matt Bloomfield 4 weeks ago.

  • #82630

    Last Name
    Participant

    I have a CSV that needs some formatting during import.

    The first 4 lines need to be removed completely.
    The 5th line includes the header.

    I've tried to use the Import-Csv cmdlet, but it imports everything and I haven't found a way to remove the lines.
    Using Get-Content (GC) takes a very long time.
    I've used Get-Content "c:/temp/filename.csv" | select -skip 4 | ConvertFrom-Csv
    This also took a very long time and froze my machine.

    Adam Bertram suggested adding the "-Raw" switch to the Get-Content cmdlet, but it's still taking a very long time and freezing my box.

  • #82639

    Rick
    Participant

    You could use the -ReadCount parameter of Get-Content and set it to 0 (send all lines down the pipeline as one array) or maybe 1000 (send them in batches of 1000). You could also use the .NET StreamReader and write the output on the fly as you go through the input file, so it isn't trying to store it all in RAM. Here is an example:

    $Hugefile = New-Object System.IO.StreamReader -ArgumentList "MyHuuuugeFile.txt"
    # Compare against $null so that an empty line does not end the loop early
    while ($null -ne ($Line = $Hugefile.ReadLine()))
    {
        # example to do something with the line before saving
        $Line = $Line -replace "aaaaa","bbb"

        # save line to new file
        $Line | Out-File 'MyNewHuuuuggeFile.txt' -Append
    }
    $Hugefile.Close()
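
    Applied to the original question (drop the first 4 lines, keep the header on line 5), the same streaming idea might look like the sketch below. The function name Copy-CsvWithoutPreamble, its parameter names, and the file paths in the usage note are made up for illustration. It swaps Out-File for a StreamWriter so the output file is opened only once, and it assumes UTF-8 output is acceptable.

    ```powershell
    function Copy-CsvWithoutPreamble {
        param(
            [Parameter(Mandatory)] [string] $Path,
            [Parameter(Mandatory)] [string] $Destination,
            [int] $SkipLines = 4
        )

        # .NET resolves relative paths against the process working directory,
        # not the current PowerShell location, so convert to full paths first.
        $source = (Resolve-Path -LiteralPath $Path).ProviderPath
        $target = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Destination)

        $reader = New-Object System.IO.StreamReader -ArgumentList $source
        try {
            # StreamWriter(string) writes UTF-8 and overwrites an existing file.
            $writer = New-Object System.IO.StreamWriter -ArgumentList $target
            try {
                # Read and discard the unwanted leading lines (stops early at end of file).
                for ($i = 0; ($i -lt $SkipLines) -and ($null -ne $reader.ReadLine()); $i++) { }

                # Copy the rest one line at a time; only the current line is held in memory.
                while ($null -ne ($line = $reader.ReadLine())) {
                    $writer.WriteLine($line)
                }
            }
            finally { $writer.Close() }
        }
        finally { $reader.Close() }
    }
    ```

    Usage would be something like: Copy-CsvWithoutPreamble -Path 'C:\temp\filename.csv' -Destination 'C:\temp\filename-clean.csv', followed by Import-Csv 'C:\temp\filename-clean.csv' | ForEach-Object { ... }. Piping Import-Csv into the next command handles rows as they are read, whereas assigning its output to a variable still loads the whole file into memory.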
    
  • #82642

    Jon
    Participant

    Have you tried using Import-Csv? How large is the file?

  • #82648

    Matt Bloomfield
    Participant

    Chrissy LeMaire has blogged on techniques for working with large CSV files. You might find this article has useful pointers:
