
September 9, 2015 at 10:22 am

I am working with some very large flat text files, and I need to search each line for 2 bits of text at different fixed positions, then return how many lines contain both bits of text.
So for example, I want to know how many lines in the file have the word "bob" where "b" is at position 1, "o" is at position 2 and "b" is at position 3. Then, in the same line, I also need to find "sam" at positions 30, 31 and 32. I only want to count the row if both elements are present and in the correct positions.


September 9, 2015 at 10:27 am

I would attempt to use the Select-String and Group-Object cmdlets to solve this.

Select-String online help

Group-Object online help
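
A minimal sketch of the Select-String route, in case it helps. The sample file and its three lines are made up purely so the snippet runs on its own, and the pattern simply assumes "bob" in columns 1-3 and "sam" in columns 30-32 as described in the question. Select-String emits one match object per matching line, so counting the objects gives the number of lines.

```powershell
# Hypothetical sample file, created only so the sketch is self-contained.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'select-string-demo.txt'
$filler = 'x' * 26
$lines = @(
    "bob${filler}sam"            # both tokens in the right columns
    "bob$('x' * 25)sam"          # sam starts one column too early
    "rob${filler}sam"            # wrong first token
)
Set-Content -Path $path -Value $lines

# Anchor at the start of the line, then require exactly 26 characters
# between "bob" and "sam" so that "sam" begins in column 30.
$hits = @(Select-String -Path $path -Pattern '^bob.{26}sam')
$hitCount = $hits.Count
"Found $hitCount matching lines"
```

Select-String ignores case by default; adding the -CaseSensitive switch makes the comparison exact.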

September 9, 2015 at 11:24 am

It's pretty easy to do with RegEx.

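# "bob" in columns 1-3, any 26 characters, then "sam" starting in column 30
$pattern = [regex]"^bob.{26}sam"
$count = 0
Get-Content -Path .\data.txt | foreach {
    # Count the line only when the whole pattern matches
    if ($_ -match $pattern) { $count++ }
}
"Found $count hits"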

The ^ anchors the match to the start of the line, so bob has to sit in columns 1 to 3. The dot after bob is a wildcard that matches any single character, and {26} requires exactly 26 of them (columns 4 to 29), which pushes sam out to column 30 (1 + 3 + 26). Obviously the more precise you can get about the match, the better. For instance, since nothing constrains what follows on the line, this particular regex will match sam or samuel as long as it starts in column 30.
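
To illustrate the point about precision, here is a rough sketch comparing the pattern above with a slightly stricter variant. The three test lines are invented, and the \b word boundary is only an assumption about the data: it fits if "sam" is always followed by a space or some other non-word character, and would need to change otherwise. It also shows that -match ignores case, while -cmatch does not.

```powershell
$loose  = '^bob.{26}sam'      # pattern from the post above
$strict = '^bob.{26}sam\b'    # assumption: "sam" must end at a word boundary

# Made-up test lines: 3 characters, 26 fillers, then the second field.
$samLine    = "bob$('x' * 26)sam rest"
$samuelLine = "bob$('x' * 26)samuel rest"
$upperLine  = "BOB$('x' * 26)SAM rest"

$samLine    -match  $strict   # True
$samuelLine -match  $loose    # True:  "samuel" starts with "sam"
$samuelLine -match  $strict   # False: "sam" is followed by "u"
$upperLine  -match  $loose    # True:  -match is case-insensitive
$upperLine  -cmatch $loose    # False: -cmatch compares case exactly
```

Swapping -match for -cmatch in the counting loop is enough to make it case-sensitive.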

September 10, 2015 at 5:44 am

That worked famously!