General PowerShell Q&A: Regex for extracting Data
This topic contains 14 replies and has 7 voices.
-
July 28, 2017 at 12:44 pm #76082
Hello friends,
I am trying to write a PowerShell script that captures some event log entries and counts the number of times each domain was recycled.
Sample event log:
I have stored this log in a variable, say $Data.
Now I am looking for output like the following, produced by parsing the $Data variable:
Domain Name   No. of Times Recycled
-----------   ---------------------
a.org         2
b.com         1
c.co.in       3
d.com         1
s.org         1
se.com        1
b.ac.in       1
Kindly suggest how this can be achieved.
-
July 28, 2017 at 2:01 pm #76105
Here's a lazy attempt.
$counts = @{}
echo a.org,b.com,c.co.in,d.com,s.org,se.com,b.ac.in |
    foreach { $counts[$_] = (select-string $_ log).count }
$counts

Name    Value
----    -----
s.org   1
a.org   2
b.com   1
c.co.in 3
se.com  1
d.com   1
b.ac.in 1
-
July 28, 2017 at 2:22 pm #76108
Thanks for the suggestion. The sample above has only 7 domain names, but the real log may contain more than 300 domains, so passing each domain name explicitly would be difficult.
Kindly suggest.
-
July 28, 2017 at 2:27 pm #76111
Can you read the domains from Active Directory?
-
July 28, 2017 at 2:36 pm #76112
I was trying to use convertfrom-string, but I can't get it to work. 🙁
$template = @'
{Domain*:a.org}
{Domain*:b.com}
'@

$testText = @'
A worker process serving application pool 'a.org(domain)(4.0)(pool)' has requested a recycle because it reached its private bytes memory limit.
A worker process serving application pool 'b.com(domain)(2.0)(pool)' has requested a recycle because it reached its private bytes memory limit.
'@

$testText | convertfrom-string -templatecontent $template

Domain
------
A worker process serving application pool 'a.org(domain)(4.0)(pool)' has requested a recycle because it reached its private bytes memory limit.
A worker process serving application pool 'b.com(domain)(2.0)(pool)' has requested a recycle because it reached its private bytes memory limit.
-
July 28, 2017 at 2:44 pm #76115
Will Anderson, these domains are hosted on IIS, and not all of them will show an application pool recycle error, so fetching them from AD will not be possible.
-
July 28, 2017 at 5:04 pm #76129
Assuming the format is the same:
Class domain {
    $name
    $counted = $false
}

Class recycleCount {
    $name
    $count
}

$log = Get-Content C:\temp\temp.log
$arrayDom = @()
$arrayFinal = @()

# Collect one domain object per "recycle" line
$log | %{
    if ($_.contains("has requested a recycle")) {
        $start = $_.indexof("'")
        $end = $_.indexof("(")
        $objDom = New-Object domain
        $objDom.name = $_.Substring($start+1, $end-$start-1)
        $arrayDom += $objDom
    }
}

# Count each name once, marking every occurrence as counted
$arrayDom | %{
    $count = 0
    if ($_.counted -eq $false) {
        $name = $_.name
        $_.counted = $true
        $count += 1
        $arrayDom | %{
            if (($_.name -eq $name) -and ($_.counted -eq $false)) {
                $_.counted = $true
                $count += 1
            }
        }
        $objCnt = New-Object recycleCount
        $objCnt.name = $name
        $objCnt.count = $count
        $arrayFinal += $objCnt
    }
}

$arrayFinal
-
July 28, 2017 at 6:57 pm #76150
Amar,
Try something like this:
$Data |
Where-Object { $_ -like '*has requested a recycle*' } |
ForEach-Object { $_.Split( "'(" )[1] } |
Group-Object |
Select-Object -Property Name, Count
-
July 28, 2017 at 11:20 pm #76159
Nice! I tried that with -split.
$counts = @{} # associative array
select-string 'has requested a recycle' log | foreach {
    $org = ($_ -split {$_ -in "'",'(' })[1]
    $counts[$org]++
}
$counts

Name    Value
----    -----
b.com   1
s.org   1
c.co.in 3
d.com   1
b.ac.in 1
a.org   2
se.com  1
-
July 29, 2017 at 8:13 pm #76177
Here's a way to convert that $counts hashtable I made to an object:
foreach ( $key in $counts.keys ) {
    echo '' | select @{name='domain';expression={$key}},
                     @{name='times';expression={$counts[$key]}}
}

domain  times
------  -----
s.org   1
a.org   2
b.com   1
c.co.in 3
se.com  1
d.com   1
b.ac.in 1
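Another way to do the same conversion, as a minimal sketch: enumerate the hashtable's key/value pairs and build one object per pair. It assumes PowerShell 3.0 or later for [pscustomobject]. The $counts values below are a small subset copied from the output above, and $rows is just an illustrative name.

```powershell
# Sample hashtable; the values are a subset of the counts shown in the thread.
$counts = @{ 'a.org' = 2; 'b.com' = 1; 'c.co.in' = 3 }

# GetEnumerator() yields one key/value entry per pipeline item,
# so each entry can be turned into an object with two properties.
$rows = $counts.GetEnumerator() | ForEach-Object {
    [pscustomobject]@{ domain = $_.Key; times = $_.Value }
} | Sort-Object domain

$rows
```

Sorting is optional; it is only there because a hashtable does not keep its keys in any particular order.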
-
-
July 29, 2017 at 12:01 pm #76163
If you are not sure how large this log file will be, I recommend using a switch statement with -File, which processes the file one line at a time.
# Match each recycle line and add its domain name to the list
$log = Get-ChildItem .\event.log
$nobj = New-Object System.Collections.ArrayList
switch -regex -File $log {
    "pool '(?'dn'.*)\(domain\).*requested a recycle" {[void]($nobj.add($Matches['dn']))}
}
# Display the count for each domain name
$nobj | Group-Object | Select-Object Name,Count
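To try the same idea without a log file, here is a small self-contained sketch that runs switch -Regex over an in-memory array instead of -File. The sample lines are modeled on the event text quoted earlier in the thread and are not real log data. The names $lines, $names and $result are illustrative, and the capture uses [^(]+ in place of .* (my substitution, on the assumption that a pool name never contains an opening parenthesis).

```powershell
# Sample lines modeled on the event text quoted in the thread (not real log data).
$lines = @(
    "A worker process serving application pool 'a.org(domain)(4.0)(pool)' has requested a recycle because it reached its private bytes memory limit."
    "A worker process serving application pool 'b.com(domain)(2.0)(pool)' has requested a recycle because it reached its private bytes memory limit."
    "A worker process serving application pool 'a.org(domain)(4.0)(pool)' has requested a recycle because it reached its private bytes memory limit."
    "Some unrelated event that should be ignored."
)

$names = New-Object System.Collections.ArrayList

# When given an array, switch tests each element against the pattern in turn.
# On a match, $Matches holds the named group 'dn'.
switch -Regex ($lines) {
    "pool '(?<dn>[^(]+)\(domain\).*requested a recycle" {
        [void]$names.Add($Matches['dn'])
    }
}

# Group identical names and keep only the name and its count.
$result = $names | Group-Object | Select-Object Name, Count | Sort-Object Name
$result
```

Lines that do not match the pattern simply fall through the switch, so no separate filtering step is needed.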
-
July 29, 2017 at 1:44 pm #76165
What does this part mean? Somehow .* becomes $matches['dn']?
(?'dn'.*)
-
-
July 29, 2017 at 2:20 pm #76168
'(?'dn'.*)' is a named capture group. It captures the domain name, in this case all the text between "pool '" and "(domain)". When a line matches, the automatic $Matches hashtable is populated, and the captured text is available as $Matches['dn'].
() = the text matched inside the parentheses is captured
'.*' = any number of characters
?'dn' = give the capture group the name 'dn'
-
July 29, 2017 at 3:00 pm #76169
I was reading about capture groups here, but the format used greater than and less than signs (this forum can't show them) instead of single quotes. https://ss64.com/ps/syntax-regex.html
-
-
July 30, 2017 at 3:14 am #76193
Both formats are valid.
For example:
$inputval = 'abcdefghijklmnopqrstuvwxyz'
$inputval -match "(?<beforeE>.*)e.*v(?'afterV'.*)$"
$Matches
Results:
True

Name     Value
----     -----
afterV   wxyz
beforeE  abcd
0        abcdefghijklmnopqrstuvwxyz
There is also another format, (?P<name>.*), but it is not supported by .NET and consequently not supported in PowerShell.
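As a quick check of the three syntaxes, here is a minimal sketch that tries each one separately. It assumes PowerShell 5.0 or later for [regex]::new. The variable names $angle, $quote and $pythonStyleOk are illustrative only.

```powershell
$inputval = 'abcdefghijklmnopqrstuvwxyz'

# Angle-bracket form: (?<name>...)
$null = $inputval -match '(?<beforeE>.*)e'
$angle = $Matches['beforeE']

# Single-quote form: (?'name'...). A double-quoted string avoids escaping the quotes.
$null = $inputval -match "v(?'afterV'.*)"
$quote = $Matches['afterV']

# Python-style form: (?P<name>...). The .NET regex engine rejects the pattern,
# so constructing it throws and the catch block runs.
$pythonStyleOk = $true
try {
    $null = [regex]::new('(?P<beforeE>.*)e')
} catch {
    $pythonStyleOk = $false
}

$angle, $quote, $pythonStyleOk
```

The first two forms produce the same kind of entry in $Matches, so the choice between them mostly comes down to which one is easier to quote in the surrounding string.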
Ref: http://www.regular-expressions.info/refext.html