June 3, 2018 at 9:33 am

Hi All

I have created a PowerShell script that reads a web page and extracts the table on it into a CSV. I extract data from multiple web pages, but the table extraction process is the same for each.

1. I have tried a lot of things, but I am not able to populate the $ResultCSV object. When I try to print $ResultCSV after this step:

 $ResultCSV += $row 

I get System.Object as the error.

2. How can I make this code better? I have spent all of yesterday and today on this, but I am stuck in a rut. Thanks for all the help.

Here is the code for reference.

function Get-ControlMJobInfo {
    param (
    [string]$URL="" 
    )

    $a = iwr $URL
    $ResultCSV = @()  
    $TempObj = "C:\Users\nrao1\Desktop\temp"
    ($a.AllElements[0].innerText)  | set-content $TempObj

    $Controlobject = Get-Content $TempObj
     
    # The extracted data has the header on line 2 and is pipe (|) delimited. Store all the header names in a variable.
    $FieldHeaderArray = ((Get-Content $TempObj -TotalCount 2)[-1]).split("|").Trim()

    # Capture the number of fields in the table.
    $NoofFields= ((Get-Content $TempObj -TotalCount 2)[-1]).split("|").Count

    foreach ($line in $Controlobject) {

        $Fields = $line.split("|").Trim()
        # The page also contains free text that is not part of the table.
        # Keep only the lines that have the expected number of delimited fields.
        If ($Fields.count -eq $NoofFields) {

            # Create an object to append to the array
            $row = New-Object System.Object
            # Create the properties dynamically from the names in $FieldHeaderArray captured above
            For ($i = 0; $i -lt $NoofFields; $i++) {
                $row | Add-Member -MemberType NoteProperty -Name $FieldHeaderArray[$i] -Value $Fields[$i] -Force
            }

            $ResultCSV += $row
        }
    }

    $Controlobject = ""
    }

# Pass two URLs to the function, one call each, to extract each table into CSV
Get-ControlMJobInfo "http://random1"
# Store the result in a new variable so that it is not overwritten when the function is called again
$ResultCSV1 = $Resultcsv
Get-ControlMJobInfo "http://random2"
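
One thing that may be relevant here: $ResultCSV is assigned inside the function, so it lives in the function's scope and $Resultcsv is empty once the call returns. A minimal sketch of one alternative is to have the parsing step emit its rows and capture them at the call site. ConvertFrom-PipeTable, its parameters, and the sample lines are hypothetical names and made-up data used only for illustration. Unlike the script above, this sketch starts after the header line, so the header itself does not become a data row.

```powershell
# Hypothetical helper: turns pipe-delimited lines into objects and RETURNS them,
# so the caller does not depend on a variable that only exists inside the function.
function ConvertFrom-PipeTable {
    param (
        [string[]]$Lines,        # raw lines, e.g. the output of Get-Content
        [int]$HeaderIndex = 1    # header is on line 2, i.e. zero-based index 1
    )

    # Header names, trimmed of surrounding whitespace
    $headers = @($Lines[$HeaderIndex].Split('|').Trim())

    for ($n = $HeaderIndex + 1; $n -lt $Lines.Count; $n++) {
        $fields = @($Lines[$n].Split('|').Trim())

        # Skip free text: keep only lines with the same number of fields as the header
        if ($fields.Count -ne $headers.Count) { continue }

        # Build the properties in header order
        $props = [ordered]@{}
        for ($i = 0; $i -lt $headers.Count; $i++) {
            $props[$headers[$i]] = $fields[$i]
        }

        # Writing the object without assigning it sends it to the function's output
        [pscustomobject]$props
    }
}

# Made-up sample input standing in for the text extracted from a page
$sample = @(
    'Job report'
    'Name | Status | Owner'
    'JobA | OK | alice'
    'some free text without the table layout'
    'JobB | Failed | bob'
)

# Capture the returned rows; @() keeps the result an array even for a single row
$rows = @(ConvertFrom-PipeTable -Lines $sample)
$rows | ConvertTo-Csv -NoTypeInformation
```

With this shape each call gets its own variable (for example $rows1 and $rows2), so nothing needs to be copied aside between calls, and Export-Csv can replace ConvertTo-Csv when a file is wanted.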