December 18, 2017 at 3:05 pm

Hi,
I am able to compare large files, but I only need the new lines in the CurrentDwnld.txt file.
(Meaning that if there is an extra line in EarlierDwnld.txt, I do not want it in the final output.)
Also, I want to avoid looping through each line in the file, as it is huge.
Any thoughts would be really appreciated.

$File_Path = "C:\temp\"
$File_CurrentDwnld = $File_Path  + "File_CurrentDwnld.txt"
$File_EarlierDwnld = $File_Path  + "File_EarlierDwnld.txt"
$Compare_Download = compare-object (get-content $File_CurrentDwnld) (get-content $File_EarlierDwnld)
$Compare_Download = $Compare_Download.InputObject

$File_Difference = $File_Path  + "File_Difference.txt"
$Compare_Download > $File_Difference

December 18, 2017 at 3:10 pm

Keep in mind that running Get-Content on a huge file is going to consume a lot of memory, because it loads the whole file at once. It might be worth putting up with the slower speed of reading line by line.

That said, with large files, this is a case where I'd turn to an outside utility, not use Compare-Object. There are better, and far faster, text file comparison tools that are written in C++ and offer a lot more flexibility.

December 18, 2017 at 3:33 pm

Don, I appreciate your input, although the script above is working fine (even for the large files I am using).
All I need is the new lines on the output file. Please see earlier post for details.
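
A minimal sketch of one way to keep only the new lines while still using Compare-Object: filter on the SideIndicator property instead of taking every InputObject. The sample arrays below are hypothetical stand-ins for the two download files. Note that Compare-Object compares case-insensitively unless -CaseSensitive is given.

```powershell
# Hypothetical sample data standing in for the two download files
$current = 'alpha', 'beta', 'gamma', 'delta'   # plays the role of File_CurrentDwnld.txt
$earlier = 'alpha', 'beta', 'omega'            # plays the role of File_EarlierDwnld.txt

# ReferenceObject = earlier, DifferenceObject = current,
# so '=>' marks lines that appear only in the current download
$newLines = Compare-Object -ReferenceObject $earlier -DifferenceObject $current |
    Where-Object { $_.SideIndicator -eq '=>' } |
    ForEach-Object { $_.InputObject }

$newLines
```

With the real files, the two arrays would come from Get-Content, so the memory cost of loading both files still applies.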

December 18, 2017 at 6:57 pm

# Make test files
Remove-Item '.\test1.txt','.\test2.txt' -Force -ErrorAction SilentlyContinue
$FilePath1 = '.\test1.txt'
$FilePath2 = '.\test2.txt'
1..1000 | ForEach-Object { "Exhale completely through your mouth, making a whoosh sound $_" | Out-File $FilePath1 -Append }
5..1005 | ForEach-Object { "Exhale completely through your mouth, making a whoosh sound $_" | Out-File $FilePath2 -Append }
# Note that creating the test files above is the slowest part of the process by far - using DotNet

# Read File 1 using COM Object which is faster than DotNet
$fso = New-Object -ComObject 'Scripting.FileSystemObject'
$FileObj1 = $fso.OpenTextFile($((Get-Item $FilePath1).FullName),1)
$File1Lines = while (! $FileObj1.AtEndOfStream ) { $FileObj1.ReadLine() }
$FileObj1.Close() # release the handle once the file has been read

# Read each line of file 2 and compare to file 1 lines, recording lines that do NOT match
$FileObj2 = $fso.OpenTextFile($((Get-Item $FilePath2).FullName),1)
$LinesIn2ButNotIn1 = while (! $FileObj2.AtEndOfStream ) { 
    $Line = $FileObj2.ReadLine()
    if ($Line -notin $File1Lines ) { $Line } 
}
$FileObj2.Close() # release the handle once the file has been read
$LinesIn2ButNotIn1

# To get lines in 1 but not in 2, you do the opposite:
$FileObj2 = $fso.OpenTextFile($((Get-Item $FilePath2).FullName),1)
$File2Lines = while (! $FileObj2.AtEndOfStream ) { $FileObj2.ReadLine() }
$FileObj1 = $fso.OpenTextFile($((Get-Item $FilePath1).FullName),1)
$LinesIn1ButNotIn2 = while (! $FileObj1.AtEndOfStream ) { 
    $Line = $FileObj1.ReadLine()
    if ($Line -notin $File2Lines ) { $Line } 
}
$LinesIn1ButNotIn2

$FileObj1.Close()
$FileObj2.Close()
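
A possible variation on the script above, offered as a sketch rather than a drop-in replacement: `-notin` against an array rescans that array for every line, so loading the earlier file into a .NET HashSet keeps each lookup cheap. The file names and sample contents here are hypothetical, and the sketch assumes PowerShell 5 or later for the `::new()` syntax. Unlike `-notin`, the ordinal comparer used here is case-sensitive.

```powershell
# Hypothetical sample files, written to the temp folder for illustration only
$earlierPath = Join-Path ([System.IO.Path]::GetTempPath()) 'sketch_earlier.txt'
$currentPath = Join-Path ([System.IO.Path]::GetTempPath()) 'sketch_current.txt'
1..1000 | ForEach-Object { "line $_" } | Set-Content -Path $earlierPath
5..1005 | ForEach-Object { "line $_" } | Set-Content -Path $currentPath

# Load the earlier file into a hash set so membership checks do not scan an array
$seen = [System.Collections.Generic.HashSet[string]]::new([System.StringComparer]::Ordinal)
foreach ($line in [System.IO.File]::ReadLines($earlierPath)) { [void]$seen.Add($line) }

# Stream the current file and keep only the lines that were not in the earlier file
$newLines = foreach ($line in [System.IO.File]::ReadLines($currentPath)) {
    if (-not $seen.Contains($line)) { $line }
}

$newLines

# Clean up the sample files
Remove-Item $earlierPath, $currentPath
```

The .NET file methods resolve relative paths against the process working directory rather than the PowerShell location, so full paths are the safer choice when adapting this.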

December 18, 2017 at 9:11 pm

Thanks Sam.
Really appreciate the script