
October 11, 2017 at 2:39 pm

Hello. I have a text file that looks like this:
{
"something" "else"
"even" "2"
"moretext" "704 1696 -40"
}
{
"text" "random"
"odd" "1"
"never" "more"
}
...

Basically it's a series of multiline blocks enclosed in curly brackets. There can be any number of lines between the brackets.
What I need is to search this file for a string and, if the string is found, remove every block (brackets included) that contains it. So for example, if I search for 'more', it should delete the whole block:
{
"text" "random"
"odd" "1"
"never" "more"
}
If I search for '1', it should remove both blocks.

I'm sure it can be done with regex, but my regex skills aren't good enough to figure this out.
Any help is appreciated.

October 11, 2017 at 2:56 pm

Short answer:

$($(get-content sample.txt) -join '').split('}').trimstart('{') | where-object {$_ -notlike "*random*"}

Long answer:
Get-Content imports your text file as an array, with each line as one item.

get-content sample.txt

{
"something" "else"
"even" "2"
"moretext" "704 1696 -40"
}
{
"text" "random"
"odd" "1"
"never" "more"
}

Step 1 is to use -join to make one long string out of the input.

$(get-content sample.txt) -join ''

{"something" "else""even" "2""moretext" "704 1696 -40"}{"text" "random""odd" "1""never" "more"}

Step 2 is to turn that string back into an array by splitting on the closing curly brace.

$($(get-content sample.txt) -join '').split('}')

{"something" "else""even" "2""moretext" "704 1696 -40"
{"text" "random""odd" "1""never" "more"

Step 3 is to strip the opening curly brace from each item.

$($(get-content sample.txt) -join '').split('}').trimstart('{')

"something" "else""even" "2""moretext" "704 1696 -40"
"text" "random""odd" "1""never" "more"

Step 4 is to use Where-Object to filter out any items that contain the search string.

$($(get-content sample.txt) -join '').split('}').trimstart('{') | where-object {$_ -notlike "*random*"}

"something" "else""even" "2""moretext" "704 1696 -40"

October 11, 2017 at 3:44 pm

Jeremy, thank you, that works.
I need to preserve the original file format though (I didn't mention that explicitly above). How can I achieve that with your code?

October 11, 2017 at 3:54 pm

Hi John,

You should use a RegEx like this:

\{(.|\n)*?more(.|\n)*?\}

In this case, if it finds the word "more", it matches the whole text within {}. However, it matches parts of words as well: in your example it selects both blocks, since the first has the word "moretext" and the second has "more".
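
A quick way to try the pattern above is to run it against a single string. This is only a sketch: the here-string copies the sample from the first post, and $text, $pattern and $found are names made up for the example.

```powershell
# Sample data held in a here-string so the example needs no file on disk.
$text = @'
{
"something" "else"
"even" "2"
"moretext" "704 1696 -40"
}
{
"text" "random"
"odd" "1"
"never" "more"
}
'@

# The pattern from the post: '{', anything (lazily), 'more', anything (lazily), '}'.
$pattern = '\{(.|\n)*?more(.|\n)*?\}'

# Collect every match in the string.
$found = [regex]::Matches($text, $pattern)
$found.Count   # 2: "moretext" in the first block counts as well as "more" in the second
```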

October 11, 2017 at 4:13 pm

Leandro, I'm not getting any matches using your regex. Maybe because Get-Content loads the file as an array of lines?
Edit: but yes, it should include partial matches too. My example above was incorrect; I didn't notice both blocks contained 'more'.

October 11, 2017 at 6:10 pm

Yeah, Get-Content reads the file as an array of lines, and the regex needs the whole text as a single string; try doing a

Get-Content -Path FILE_PATH -Raw
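
To see the difference, here is a small sketch that writes a one-block file to the temp folder and reads it back both ways. The file name blocks-sample.txt and the variable names are assumptions for illustration only.

```powershell
# Write a tiny three-line sample file to the temp folder (hypothetical file name).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'blocks-sample.txt'
Set-Content -Path $path -Value '{', '"never" "more"', '}'

$lines = Get-Content -Path $path        # array of 3 strings, one per line
$raw   = Get-Content -Path $path -Raw   # one string, line breaks included

$pattern = '\{(.|\n)*?more(.|\n)*?\}'

# Against an array, -match tests each line separately and returns the lines that match.
@($lines -match $pattern).Count   # 0: no single line holds '{', 'more' and '}'

# Against the raw string, the pattern can span line breaks.
$raw -match $pattern              # True

Remove-Item -Path $path           # clean up the sample file
```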

October 11, 2017 at 9:53 pm

You are going to run into a problem with the \{(.|\n)*?more(.|\n)*?\} pattern: a single match can cross multiple blocks.

For example: \{(.|\n)*?never(.|\n)*?\} matches

{
"something" "else"
"even" "2"
"moretext" "704 1696 -40"
}
{
"text" "random"
"odd" "1"
"never" "more"
}

Rather than just

{
"text" "random"
"odd" "1"
"never" "more"
}
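
The over-match is easy to reproduce on a string. A minimal sketch, with the sample data from the first post in a here-string and $text and $m as example names:

```powershell
$text = @'
{
"something" "else"
"even" "2"
"moretext" "704 1696 -40"
}
{
"text" "random"
"odd" "1"
"never" "more"
}
'@

# The match starts at the first '{' and lazily expands until it reaches 'never',
# which only occurs in the second block, so it swallows the first block too.
$m = [regex]::Match($text, '\{(.|\n)*?never(.|\n)*?\}')

$m.Value.Length -eq $text.Length   # True: the single match covers both blocks
```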

What I would do is first find all of my blocks, then filter out the ones I don't want, then join all the remaining blocks back together.

Something like this:

cls
$exclude = "odd"
((Get-Content -Path "D:\New Text Document.txt" -Raw | Select-String -Pattern "(?s)\{.*?\}" -AllMatches).matches.value | Select-String -Pattern $exclude -NotMatch) -join "`n"
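
The same idea, split into its three steps on an in-memory string, might look like the sketch below. It swaps Select-String for [regex]::Matches and Where-Object, and it runs the search string through [regex]::Escape so that it is treated as literal text; both are choices made for this example, as are the names $text, $blocks, $kept and $result.

```powershell
$text = @'
{
"something" "else"
"even" "2"
"moretext" "704 1696 -40"
}
{
"text" "random"
"odd" "1"
"never" "more"
}
'@
$exclude = 'odd'

# 1. Find every {...} block; (?s) lets '.' cross line breaks, '.*?' stops at the first '}'.
$blocks = [regex]::Matches($text, '(?s)\{.*?\}') | ForEach-Object { $_.Value }

# 2. Keep only the blocks that do not contain the search string (case-insensitive).
$kept = @($blocks | Where-Object { $_ -notmatch [regex]::Escape($exclude) })

# 3. Join the remaining blocks back together, one per line group.
$result = $kept -join "`n"
$result
```

Because each block is matched on its own before filtering, a hit in one block cannot drag a neighbouring block along with it.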

October 12, 2017 at 12:31 am

One more for your consideration...

$RandomData = @'
{
"something" "else"
"even" "2"
"A new record" "704 1696 -40"
}
{
"something" "else"
"even" "2"
"I want this one" "704 1696 -40"
}
{
"text" "random"
"odd" "1"
"never" "more"
}
{
"something" "else"
"even" "2"
"moretext" "704 1696 -40"
}
{
"text" "random"
"odd" "1"
"never" "And this one also"
}
{
"something" "else"
"even" "2"
"moretext" "704 1696 -40"
}
{
"something" "else"
"even" "2"
"The last record" "704 1696 -40"
}
'@

# Remove all record entries that match the string 'more'

# Validate pattern match
$RandomData -match '.[^]\b[^?{]*(.*more.*)\b(.|\n)*?\}*}'
# True

# Get all matches
[regex]::Matches($RandomData,'.[^]\b[^?{]*(.*more.*)\b(.|\n)*?\}*}').Value

{
"text" "random"
"odd" "1"
"never" "more"
}
{
"something" "else"
"even" "2"
"moretext" "704 1696 -40"
}
{
"something" "else"
"even" "2"
"moretext" "704 1696 -40"
}

# Remove matches from the data
$RandomData -replace '.[^]\b[^?{]*(.*more.*)\b(.|\n)*?\}*}'

{
"something" "else"
"even" "2"
"A new record" "704 1696 -40"
}
{
"something" "else"
"even" "2"
"I want this one" "704 1696 -40"
}
{
"text" "random"
"odd" "1"
"never" "And this one also"
}
{
"something" "else"
"even" "2"
"The last record" "704 1696 -40"
}

October 12, 2017 at 3:12 am

Nice, I never even considered using -replace, but it makes a lot of sense.

Here is another regex that could be used with -replace and takes fewer steps to process.

$exclude = "odd"
(Get-Content -Path "D:\New Text Document.txt" -Raw) -replace "{[^\}]*$exclude[^\}]*}(\r\n|\n)"
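
Here is a hedged variation on that one-liner, run against a here-string instead of a file. Two things differ from the line above and are assumptions of this sketch: the search string goes through [regex]::Escape so characters like '.' are taken literally, and the trailing line break is optional so that a final block with no newline after it is still removed. The names $text, $pattern and $result are made up for the example.

```powershell
$text = @'
{
"something" "else"
"even" "2"
"moretext" "704 1696 -40"
}
{
"text" "random"
"odd" "1"
"never" "more"
}
'@
$exclude = 'odd'

# '[^}]*' cannot step over a closing brace, so a match never leaves its own block.
# '(\r?\n)?' also removes the line break after the block when there is one.
$pattern = '\{[^}]*' + [regex]::Escape($exclude) + '[^}]*\}(\r?\n)?'

# -replace with no replacement text deletes every match.
$result = $text -replace $pattern
$result
```

To keep the change, the result can be written back out with Set-Content to a path of your choosing.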

October 12, 2017 at 6:16 am

Thanks everyone for the input. I went with Curtis' code in the end; it works well.
At least I got some regexes to study.