Comparing Large files


  • #89531

    I am able to compare large files, but I only need the new lines in the CurrentDwnld.txt file.
    (Meaning that if there is an extra line in EarlierDwnld.txt, I do not want it in the final output.)
    I also want to avoid looping through each line in the file, as it is huge.
    Any thoughts would be really appreciated.

    $File_Path = "C:\temp\"
    $File_CurrentDwnld = $File_Path  + "File_CurrentDwnld.txt"
    $File_EarlierDwnld = $File_Path  + "File_EarlierDwnld.txt"
    $Compare_Download = compare-object (get-content $File_CurrentDwnld) (get-content $File_EarlierDwnld)
    $Compare_Download = $Compare_Download.InputObject
    $File_Difference = $File_Path  + "File_Difference.txt"
    $Compare_Download > $File_Difference
  • #89534

    Keep in mind that running Get-Content on a huge file is going to consume a lot of memory. It might be worth putting up with the slower speed of looping through the lines.

    That said, with large files, this is a case where I'd turn to an outside utility, not use Compare-Object. There are better, and far faster, text file comparison tools that are written in C++ and offer a lot more flexibility.

    • #89540

      Don, I appreciate your input, although the script above is working fine (even for the large files I am using).
      All I need is the new lines in the output file. Please see my earlier post for details.
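
      For reference, one way to keep only the new lines is to filter on the SideIndicator property that Compare-Object attaches to each result. This is a minimal sketch, assuming the current file is passed as the reference object (as in the script above), in which case '<=' marks lines found only in the current file. The sample arrays are made up for illustration and stand in for the Get-Content calls:

```powershell
# Hypothetical sample data standing in for Get-Content of each file
$Current = 'alpha', 'beta', 'gamma', 'delta'   # e.g. Get-Content $File_CurrentDwnld
$Earlier = 'alpha', 'beta', 'omega'            # e.g. Get-Content $File_EarlierDwnld

# ReferenceObject = current, DifferenceObject = earlier.
# '<=' means the line exists only on the reference (current) side,
# '=>' means it exists only on the difference (earlier) side.
$NewLines = Compare-Object -ReferenceObject $Current -DifferenceObject $Earlier |
    Where-Object { $_.SideIndicator -eq '<=' } |
    ForEach-Object { $_.InputObject }

# 'gamma' and 'delta' are kept; 'omega' (only in the earlier file) is dropped
$NewLines
```

      The rest of the original script would stay the same: $NewLines can be redirected to the difference file instead of the unfiltered InputObject list.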

  • #89561

    # Make test files
    $FilePath1 = '.\test1.txt'
    $FilePath2 = '.\test2.txt'
    Remove-Item $FilePath1, $FilePath2 -Force -ErrorAction SilentlyContinue
    1..1000 | ForEach-Object { "Exhale completely through your mouth, making a whoosh sound $_" | Out-File $FilePath1 -Append }
    5..1005 | ForEach-Object { "Exhale completely through your mouth, making a whoosh sound $_" | Out-File $FilePath2 -Append }
    # Note that creating the test files is the slowest part of the process by far - using DotNet

    # Read file 1 using a COM object, which is faster than DotNet (1 = open for reading)
    $fso = New-Object -ComObject 'Scripting.FileSystemObject'
    $FileObj1 = $fso.OpenTextFile((Get-Item $FilePath1).FullName, 1)
    $File1Lines = while (-not $FileObj1.AtEndOfStream) { $FileObj1.ReadLine() }
    $FileObj1.Close()

    # Read each line of file 2 and compare it to the file 1 lines, recording lines that do NOT match
    $FileObj2 = $fso.OpenTextFile((Get-Item $FilePath2).FullName, 1)
    $LinesIn2ButNotIn1 = while (-not $FileObj2.AtEndOfStream) {
        $Line = $FileObj2.ReadLine()
        if ($Line -notin $File1Lines) { $Line }
    }
    $FileObj2.Close()

    # To get lines in 1 but not in 2, you do the opposite:
    $FileObj2 = $fso.OpenTextFile((Get-Item $FilePath2).FullName, 1)
    $File2Lines = while (-not $FileObj2.AtEndOfStream) { $FileObj2.ReadLine() }
    $FileObj2.Close()
    $FileObj1 = $fso.OpenTextFile((Get-Item $FilePath1).FullName, 1)
    $LinesIn1ButNotIn2 = while (-not $FileObj1.AtEndOfStream) {
        $Line = $FileObj1.ReadLine()
        if ($Line -notin $File2Lines) { $Line }
    }
    $FileObj1.Close()
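
    A possible refinement, not part of the script above: -notin scans the whole array for every line, so the work grows with the product of the two line counts. Below is a hedged sketch that loads the earlier file into a .NET HashSet and streams the current file with [System.IO.File]::ReadLines, so each lookup is roughly constant time and only one file is held in memory. The function name Get-NewLines and the demo file names are made up for illustration. It assumes PowerShell 5 or later (for the ::new syntax), and the comparison is case-sensitive (ordinal), unlike -notin:

```powershell
function Get-NewLines {
    param(
        [Parameter(Mandatory)] [string] $CurrentPath,
        [Parameter(Mandatory)] [string] $EarlierPath
    )
    # Resolve to full paths: .NET methods do not follow PowerShell's current location
    $currentFull = (Resolve-Path -LiteralPath $CurrentPath).ProviderPath
    $earlierFull = (Resolve-Path -LiteralPath $EarlierPath).ProviderPath

    # Load every line of the earlier file into a hash set (ordinal, case-sensitive)
    $seen = [System.Collections.Generic.HashSet[string]]::new([System.StringComparer]::Ordinal)
    foreach ($line in [System.IO.File]::ReadLines($earlierFull)) {
        [void]$seen.Add($line)
    }

    # Stream the current file and emit only lines the earlier file did not contain
    foreach ($line in [System.IO.File]::ReadLines($currentFull)) {
        if (-not $seen.Contains($line)) { $line }
    }
}

# Demo with two small temp files (hypothetical names)
$earlier = Join-Path ([System.IO.Path]::GetTempPath()) 'demo_earlier.txt'
$current = Join-Path ([System.IO.Path]::GetTempPath()) 'demo_current.txt'
Set-Content -Path $earlier -Value 'line 1', 'line 2', 'line 3'
Set-Content -Path $current -Value 'line 2', 'line 3', 'line 4', 'line 5'

# 'line 1' exists only in the earlier file, so it is not reported
$result = Get-NewLines -CurrentPath $current -EarlierPath $earlier
$result   # line 4, line 5
```

    Swapping the two arguments gives the lines that are in the earlier file but not in the current one. Duplicate lines in the current file are emitted each time they appear, provided they are absent from the earlier file.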
    • #89564

      Thanks Sam.
      Really appreciate the script
