
July 11, 2017 at 8:20 am

As I have been unable to find a current RSS feed for Windows updates, I am trying to parse some data from the Microsoft support site. I can build the URL dynamically, as the only part that changes is the number of the knowledge base article. I am interested in the text below where it says Summary, but I can't find a method to extract this information with Invoke-WebRequest.

An example URL is below:

$web= Invoke-WebRequest "https://support.microsoft.com/en-us/help/4022887/title#!/en-us/help/4022887/title"
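
The URL construction described above could be sketched roughly as follows. The function name `Get-KbArticleUrl` is hypothetical, and the sketch assumes the `#!/...` fragment from the example can be left off, since a URL fragment is never sent to the server in the HTTP request.

```powershell
# Hypothetical helper: builds the support-article URL from a KB number
function Get-KbArticleUrl {
    param([Parameter(Mandatory)][int]$KbNumber)
    # Only the article number varies; the rest follows the example URL in this thread
    "https://support.microsoft.com/en-us/help/$KbNumber/title"
}

$url = Get-KbArticleUrl -KbNumber 4022887
$url
```

The resulting string can then be passed to Invoke-WebRequest or to a browser object.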

July 11, 2017 at 12:18 pm

So there's a couple of ways you could do it, but before I start jumping down the wrong rabbit hole, would this site get you the data you need?

https://support.microsoft.com/en-us/gp/selectrss?target=rss

July 11, 2017 at 12:24 pm

Hi, thanks for the reply. I looked at the RSS feeds on that site, but it appears they are no longer being updated.

July 11, 2017 at 12:37 pm

I am trying

$web ="https://support.microsoft.com/en-us/help/4022887/title#!/en-us/help/4022887/title"
$data = invoke-Webrequest $web
$result = $data.ParsedHtml.body.getElementsByClassName('kb-summary-section section ng-scope.x-hidden-focus')

I only want the information in the summary, but nothing is being returned in $result.

July 11, 2017 at 9:45 pm

Invoke-RestMethod?

July 12, 2017 at 6:12 am

Hi Simon,

It seems to me that you are doing everything right, but the complete raw result of the request doesn't contain anything useful. So I tried using the Internet Explorer COM object through PowerShell, and it worked. Not pretty, but it gets the result you are looking for:

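   # Drive Internet Explorer via COM and read the DOM after the page has loaded
   $ie = New-Object -ComObject "InternetExplorer.Application"
   $ie.silent = $true
   $ie.navigate($web)
   # Wait until the browser reports it is no longer busy
   while ($ie.busy) { Start-Sleep -Seconds 1 }
   $result = $ie.document.body.getElementsByClassName("kb-summary-section") | Select-Object -ExpandProperty innertext
   $ie.quit()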

Cheers
Wilm

July 12, 2017 at 7:22 am

Thanks Wilm that does the trick.

July 13, 2017 at 8:10 am

Just when I thought it was safe to go back into the water 🙂
When I use the following :-
$ie = new-object -ComObject "InternetExplorer.Application"
$ie.silent = $true
$web ="https://support.microsoft.com/en-us/help/4022887/title#!/en-us/help/4022887/title"
$ie.navigate($web)
$result = ""
$result = $ie.document.body.getElementsByClassName("container section-body") | select -ExpandProperty innertext
$kbarticle = $result -split "Symptom" | select -first 1
$ws.cells.item($intRow,4) = $kbarticle
$ws.cells.item($intRow,5) = $web

It writes the contents of $kbarticle to the cell in Excel (OK, I have not included the code to open Excel here), but there are two carriage returns at the top of the data, so in order to see the data you have to click into the cell. (I spent hours thinking it wasn't writing the data before I spotted the two carriage returns 🙂.) I have tried $kbarticle.Trim(), but that does not seem to work. Any ideas?

July 13, 2017 at 10:20 am

I fixed the issue with:

$trimmedKbArticle = $kbarticle.ToString()
$ws.cells.item($intRow,4) = $trimmedKbArticle.Trim()
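
One possible reason the earlier attempt appeared to do nothing: Trim() returns a new string and leaves the original variable untouched, so the result has to be assigned or used directly, as in the fix above. A minimal sketch, assuming the scraped text starts with two line breaks (the sample string is made up for illustration):

```powershell
# Hypothetical stand-in for the scraped text: two leading line breaks, then content
$kbarticle = "`r`n`r`nThis update includes quality improvements.`r`n"

# Calling Trim() without keeping the result changes nothing in $kbarticle
$null = $kbarticle.Trim()
$stillUntrimmed = $kbarticle.StartsWith("`r`n")

# Capturing the return value keeps the trimmed text
$trimmed = $kbarticle.Trim()
$trimmed
```

Whether this was the actual cause in the script above is a guess, since the original Trim() call is not shown.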

July 13, 2017 at 1:58 pm

How about using Invoke-RestMethod with an RSS feed? It returns an array of [XmlElement] objects.

$a = Invoke-RestMethod "https://support.microsoft.com/en-us/rss?rssid=18165"
Show-Object $a  # PowershellCookbook module
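
As a rough sketch of what could be done with those elements, the snippet below filters feed items for KB numbers. It uses a made-up sample feed in place of the live request and assumes a plain RSS 2.0 layout with `title` and `link` children; the real feed's fields may differ.

```powershell
# Made-up sample feed standing in for the Invoke-RestMethod result (assumption: RSS 2.0 layout)
[xml]$feed = @"
<rss version="2.0"><channel>
  <item><title>Example update (KB0000001)</title><link>https://example.com/kb/0000001</link></item>
  <item><title>Unrelated announcement</title><link>https://example.com/news</link></item>
</channel></rss>
"@

# Each <item> is an [XmlElement]; its child elements surface as properties
$items = $feed.rss.channel.item

# Keep only entries whose title mentions a KB number and pull that number out
$kbItems = foreach ($item in $items) {
    if ($item.title -match 'KB(\d+)') {
        [pscustomobject]@{
            KB    = $Matches[1]
            Title = $item.title
            Link  = $item.link
        }
    }
}
$kbItems
```

With the live feed, `$items` would simply be replaced by the array that Invoke-RestMethod returns.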