March 8, 2016 at 9:15 am

Hi Forum,

My goal: I need to search for content inside the and of pages in a website for text strings. The problem is that those strings are often found in headers, footers and meta, and with our site being over 100k pages, the results are staggering.

My goal with this script is to pre-filter the pages with a multi-line regular expression, so that only the pages that return a result get Select-String'ed again to give me line-by-line results. Bonus marks if we can ignore the line numbers before and after.

This is my code so far (for some reason I can't post more than 9 lines, so please use the link):

https://anotepad.com/note/read/refdie

The errors I get when running it are:


ForEach-Object : Cannot convert 'System.Object[]' to the type 'System.Management.Automation.ScriptBlock' required by parameter 'Process'. Specified method is not supported.
At W:\test\york\_tools\menu.ps1:41 char:37
+             $result = ForEach-Object <<<<  $filter {
    + CategoryInfo          : InvalidArgument: (:) [ForEach-Object], ParameterBindingException
    + FullyQualifiedErrorId : CannotConvertArgument,Microsoft.PowerShell.Commands.ForEachObjectCommand
 
Export-Csv : Cannot bind argument to parameter 'InputObject' because it is null.
At W:\test\york\_tools\menu.ps1:44 char:33
+             $result | Export-Csv <<<<  "W:\test\search_results\$name.csv" -NoType
    + CategoryInfo          : InvalidData: (:) [Export-Csv], ParameterBindingValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationErrorNullNotAllowed,Microsoft.PowerShell.Commands.ExportCsvCommand

Any insight is greatly appreciated!

March 8, 2016 at 10:41 am

The second error happens because $result is null and you cannot pipe null to Export-CSV.

$result is null because of the first error. That error occurs because ForEach-Object tries to bind $filter to its -Process parameter, which expects a script block, and $filter is not one. ForEach-Object takes its input from the pipeline, not from an argument.
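
As a rough illustration of the binding problem, here is a minimal sketch. The contents of $filter are an assumption (a hypothetical list of file names), since the actual script is only available through the link. The point is just the difference between the ForEach-Object cmdlet, which reads from the pipeline, and the foreach statement, which takes the collection in parentheses.

```powershell
# Hypothetical stand-in for $filter; the real contents are not shown in the thread.
$filter = @('a.html', 'b.html', 'c.html')

# This is the shape used in the failing script: $filter is passed as an argument,
# so the cmdlet tries to treat it as a script block for -Process.
# $result = ForEach-Object $filter { $_.ToUpper() }

# Pipeline form: send the collection down the pipeline, pass only the script block.
$result = $filter | ForEach-Object { $_.ToUpper() }

# Statement form: the foreach keyword takes the collection in parentheses.
$result2 = foreach ($item in $filter) { $item.ToUpper() }
```

Either of the two working forms leaves $result populated, which in turn gives Export-Csv something to write.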

My advice would be to go back to basics. Get 1 file that you know contains the data you want to search for and 1 file that you know doesn't contain the data. Work on the Select-String queries you need to pull out the data you want.
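
A sketch of what that two-file test could look like, under some loud assumptions: the `<main>` region, the search term 'widget', the temp folder and the helper name Test-RegionContains are all hypothetical and not taken from the original script. The first pass reads each file as one string and checks the region with a multi-line regex; the second pass runs Select-String only on the files that survived.

```powershell
# Two throwaway test files: one with the term inside the region of interest,
# one where it only appears in the footer. All names here are hypothetical.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'select-string-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null

$hit  = Join-Path $dir 'hit.html'
$miss = Join-Path $dir 'miss.html'
Set-Content -Path $hit  -Value '<main>', 'Buy a widget today', '</main>', '<footer>widget</footer>'
Set-Content -Path $miss -Value '<main>', 'Nothing here', '</main>', '<footer>widget</footer>'

# First pass: read the whole file as one string and look for the term
# only inside the <main>...</main> region. (?s) lets . match newlines.
function Test-RegionContains {
    param([string]$Path, [string]$Term)
    $text  = [System.IO.File]::ReadAllText($Path)
    $match = [regex]::Match($text, '(?s)<main>(.*?)</main>')
    return $match.Success -and $match.Groups[1].Value -match [regex]::Escape($Term)
}

$files = $hit, $miss | Where-Object { Test-RegionContains -Path $_ -Term 'widget' }

# Second pass: line-by-line results, only for the files that passed the filter.
$lines = $files | ForEach-Object { Select-String -Path $_ -Pattern 'widget' -SimpleMatch }
$lines | Select-Object Path, LineNumber, Line
```

Note that the second pass reports every matching line in a surviving file, including the footer line, so the line-level results may still need trimming.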

Once you're happy with that, work out the code for the file handling.
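
For the file-handling step, something along these lines might be a starting point. The folder, file names and search term are made up for the example. The only part tied to the errors above is the check before Export-Csv, which skips the export when nothing was found.

```powershell
# Hypothetical demo folder and output file; substitute the real site root and CSV path.
$root = Join-Path ([System.IO.Path]::GetTempPath()) 'site-demo'
$out  = Join-Path $root 'results.csv'
New-Item -ItemType Directory -Path $root -Force | Out-Null
Set-Content -Path (Join-Path $root 'page1.html') -Value 'intro', 'a widget appears here'
Set-Content -Path (Join-Path $root 'page2.html') -Value 'nothing relevant'

# Walk the folder, search every .html file, keep only the useful columns.
$result = Get-ChildItem -Path $root -Filter *.html -Recurse |
    Select-String -Pattern 'widget' -SimpleMatch |
    Select-Object Path, LineNumber, Line

# Export only when there is something to export, so Export-Csv never receives null.
if ($result) {
    $result | Export-Csv -Path $out -NoTypeInformation
}
```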