My goal: I need to search the content of pages on a website for text strings. The problem is that those strings are often also found in headers, footers, and meta tags, and with our site being over 100k pages, the number of results is staggering.
My goal with this script is to filter the pages with a multi-line regular expression, so that only the pages that return a result get Select-String'ed again to give me line-by-line results. Bonus marks if we can ignore the line numbers before and after each match.
This is my code so far (for some reason, I can't post more than 9 lines, please use the link):
The errors I get when running it are:
Any insight is greatly appreciated!
The second error happens because $result is null, and you cannot pipe null to Export-Csv.
$result is null because of the first error. That error occurs because ForEach-Object is trying to interpret $filter as its parameters, but $filter does not contain anything it can use as parameters — it is a plain string, not a script block.
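Since the original script isn't visible here, this is a sketch of the likely shape of both problems and their fixes. The variable names ($filter, $result) come from the thread; the pattern and file paths are assumptions:

```powershell
# A plain string pattern — NOT a script block. Passing this positionally to
# ForEach-Object produces the "parameters" error described above.
$filter = '(?s)<body.*?</body>'   # hypothetical multi-line regex

# Fails: ForEach-Object expects a script block, not a string
#   Get-ChildItem *.html | ForEach-Object $filter

# Works: wrap the per-item logic in a script block { ... }
$result = Get-ChildItem -Path 'C:\site' -Filter '*.html' -Recurse |
    ForEach-Object { Select-String -Path $_.FullName -Pattern $filter }

# Guard the export so a null $result never reaches Export-Csv
if ($null -ne $result) {
    $result | Export-Csv -Path 'results.csv' -NoTypeInformation
}
```

The key point is the braces: `ForEach-Object { ... }` receives a script block to run per pipeline item, whereas a bare string in that position is treated as a property or parameter name.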
My advice would be to go back to basics. Get one file that you know contains the data you want to search for and one file that you know doesn't. Work on the Select-String queries you need to pull out the data you want.
Once you're happy with that, work out the code for the file handling.
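To illustrate the two-pass approach from the original question, here is a minimal sketch. The regex, paths, and search term are all placeholders you would replace with your own; the structure is the point:

```powershell
$searchTerm = 'text to find'            # hypothetical search string
$pages = Get-ChildItem -Path 'C:\site' -Filter '*.html' -Recurse

# Pass 1: keep only pages where the term appears inside the body,
# using a single-line-mode (?s) multi-line regex on the raw file content.
# This body-only regex is an assumption — substitute your own filter.
$candidates = $pages | Where-Object {
    (Get-Content -Path $_.FullName -Raw) -match "(?s)<body.*?$searchTerm.*?</body>"
}

# Pass 2: line-by-line matches on the surviving pages only.
# Selecting just Path and Line drops the line numbers from the output.
$result = $candidates |
    Select-String -Pattern $searchTerm |
    Select-Object -Property Path, Line

if ($null -ne $result) {
    $result | Export-Csv -Path 'results.csv' -NoTypeInformation
}
```

Reading each file once with `Get-Content -Raw` lets the `(?s)` regex span line breaks, which is what makes the first-pass filter multi-line; the second pass then works line by line as Select-String normally does.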