Regex for extracting Data

This topic contains 14 replies, has 7 voices, and was last updated by  Curtis Smith 2 weeks, 6 days ago.

  • #76082

    Amar Helloween
    Participant

    Hello friends,

    I am trying to write a PowerShell script that reads some event log entries and counts the number of times each domain was recycled.

    Sample event log:

    I have stored this log in a variable, say $Data.
    Now I am looking for output like the following, produced by parsing the $Data variable:

    Domain Name    No. of Times Recycled
    -----------    ---------------------
    a.org          2
    b.com          1
    c.co.in        3
    d.com          1
    s.org          1
    se.com         1
    b.ac.in        1

    Kindly provide your suggestion on how this can be achieved.

  • #76105

    js
    Participant

    Here's a lazy attempt.

    $counts = @{}
    
    echo a.org,b.com,c.co.in,d.com,s.org,se.com,b.ac.in | 
    foreach { 
      $counts[$_] = (select-string $_ log).count
    }
        
    $counts
    
    
    Name                           Value
    ----                           -----
    s.org                          1
    a.org                          2
    b.com                          1
    c.co.in                        3
    se.com                         1
    d.com                          1
    b.ac.in                        1
    
  • #76108

    Amar Helloween
    Participant

    Thanks for the suggestion. The above is just a sample with 7 domain names; in the actual scenario there may be 300+ domains, so passing each domain name explicitly would be difficult.

    Kindly suggest.

  • #76111

    Will Anderson
    Keymaster

    Can you read the domains from Active Directory?

  • #76112

    js
    Participant

    I was trying to use convertfrom-string, but I can't get it to work. 🙁

    $template = @'
    {Domain*:a.org}
    {Domain*:b.com}
    '@
    
    $testText = @'
    A worker process serving application pool 'a.org(domain)(4.0)(pool)' has requested a recycle because it reached its private bytes memory limit.
    A worker process serving application pool 'b.com(domain)(2.0)(pool)' has requested a recycle because it reached its private bytes memory limit.
    '@
    
    $testText | convertfrom-string -templatecontent $template 
    
    
    Domain
    ------
    A worker process serving application pool 'a.org(domain)(4.0)(pool)' has requested a recycle because it reached its private bytes memory limit.
    A worker process serving application pool 'b.com(domain)(2.0)(pool)' has requested a recycle because it reached its private bytes memory limit.
    
  • #76115

    Amar Helloween
    Participant

    Will Anderson: these domains are hosted on IIS, and not all of them will log an application pool recycle event, so fetching the list from AD will not be possible.

  • #76129

    Max
    Participant

    Assuming the format is the same:

    Class domain{
        $name
        $counted = $false
    }
    Class recycleCount{
        $name
        $count
    }
    $log = Get-Content C:\temp\temp.log
    $arrayDom = @()
    $arrayFinal = @()
    
    $log | %{if($_.contains("has requested a recycle")){
                    $start = $_.indexof("'");
                    $end = $_.indexof("(");
                    $objDom = New-Object domain;
                    $objDom.name = $_.Substring($start+1,$end-$start-1);
                    $arrayDom += $objDom;
                    }
               }
    $arrayDom | %{$count = 0;
                if($_.counted -eq $false){
                    $name = $_.name;
                    $_.counted = $true;
                    $count += 1;
                    $arrayDom | %{if(($_.name -eq $name) -and ($_.counted -eq $false)){
                                $_.counted = $true;
                                $count+=1;
                                }
                            };
                            $objCnt = New-Object recycleCount;
                            $objCnt.name = $name
                            $objCnt.count = $count
                            $arrayFinal += $objCnt
                    }
              }
    $arrayFinal
    
  • #76150

    Tim Curwick
    Participant

    Amar,

    Try something like this:

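    # [char[]] splits on either ' or ( in every PowerShell version;
    # a bare string is treated as one literal separator in PowerShell 6+.
    $Data |
    Where-Object { $_ -like '*has requested a recycle*' } |
    ForEach-Object { $_.Split( [char[]]"'(" )[1] } |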
    Group-Object |
    Select-Object -Property Name, Count

    • #76159

      js
      Participant

      Nice! I tried that with -split.

      $counts = @{} # associative array
      select-string 'has requested a recycle' log |
      foreach {
        $org = ($_ -split {$_ -in "'",'(' })[1]
        $counts[$org]++
      }
      $counts
      
      
      Name                           Value                                           
      ----                           -----                                           
      b.com                          1                                               
      s.org                          1                                               
      c.co.in                        3                                               
      d.com                          1                                               
      b.ac.in                        1                                               
      a.org                          2                                               
      se.com                         1                                               
      
    • #76177

      js
      Participant

      Here's a way to convert that $counts hashtable I made to an object:

      foreach ( $key in $counts.keys ) {
        echo '' | select @{name='domain';expression={$key}},
          @{name='times';expression={$counts[$key]}} }
      
      
      domain  times
      ------  -----
      s.org       1
      a.org       2
      b.com       1
      c.co.in     3
      se.com      1
      d.com       1
      b.ac.in     1
      
  • #76163

    random commandline
    Participant

    If you are not sure how large this log file will be, I recommend using a switch statement, which reads the file line by line.

    # Match and add each domain name to list
    $log = Get-ChildItem .\event.log
    $nobj = New-Object System.Collections.ArrayList
    
    switch -regex -File $log {
        "pool '(?'dn'.*)\(domain\).*requested a recycle" 
        {[void]($nobj.add($Matches['dn']))}
    }
    
    # Display how many times each domain name occurs
    $nobj | Group-Object | Select-Object Name,Count
    
    • #76165

      js
      Participant

      What does this part mean? Somehow .* becomes $matches['dn']?

      (?'dn'.*)
      
  • #76168

    random commandline
    Participant

    (?'dn'.*) is a named capture group. Here it captures the domain name, that is, all text between "pool '" and "(domain)". When the pattern matches, PowerShell populates the automatic $Matches hashtable, and the captured text is available as $Matches['dn'].

    ( ) = the text matched inside the parentheses is captured
    .* = any character, zero or more times
    ?'dn' = gives the capture group the name 'dn'
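
    As a minimal sketch of the explanation above, the same pattern can be tried against a single line with the -match operator. The sample line is the one quoted earlier in this thread; the variable names $line, $pattern, $domain and $whole are only illustrative.

```powershell
# Sample line in the format quoted earlier in this thread.
$line = "A worker process serving application pool 'a.org(domain)(4.0)(pool)' has requested a recycle because it reached its private bytes memory limit."

# Same pattern as in the switch statement: the group named 'dn' captures
# everything between "pool '" and "(domain)".
$pattern = "pool '(?'dn'.*)\(domain\).*requested a recycle"

if ($line -match $pattern) {
    $domain = $Matches['dn']   # text captured by the named group
    $whole  = $Matches[0]      # key 0 always holds the entire matched text
}

$domain
```

    With switch -regex the same $Matches hashtable is filled in for every line that matches, which is why the script block can read $Matches['dn'] directly.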

    • #76169

      js
      Participant

      I was reading about capture groups here, but the format used greater than and less than signs (this forum can't show them) instead of single quotes. https://ss64.com/ps/syntax-regex.html

  • #76193

    Curtis Smith
    Participant

    Both are valid formats.

    For example:

    $inputval = 'abcdefghijklmnopqrstuvwxyz'
    
    $inputval -match "(?<beforeE>.*)e.*v(?'afterV'.*)$"
    
    $Matches

    Results:

    True
    
    Name                           Value
    ----                           -----                                                                                                                                                                                                                             
    afterV                         wxyz
    beforeE                        abcd
    0                              abcdefghijklmnopqrstuvwxyz

    There is also another format, (?P<name>.*), but it is not supported by .NET and consequently not supported in PowerShell.
    Ref: http://www.regular-expressions.info/refext.html
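
    A quick way to check this on your own machine is sketched below. It assumes PowerShell 5 or later (for [regex]::new); the group name 'head' and the variable names are made up for the illustration.

```powershell
$inputval = 'abcdefghijklmnopqrstuvwxyz'

# Both .NET spellings of a named group fill the same key in $Matches.
$null  = $inputval -match '(?<head>abc)'
$angle = $Matches['head']
$null  = $inputval -match "(?'head'abc)"
$quote = $Matches['head']

# The (?P<head>...) spelling is rejected when .NET parses the pattern,
# so constructing the regex throws and the catch block runs.
$pythonStyleAccepted = $true
try {
    $null = [regex]::new('(?P<head>abc)')
}
catch {
    $pythonStyleAccepted = $false
}

"$angle $quote $pythonStyleAccepted"
```

    If you are porting a pattern written in the (?P<name>...) style, dropping the P is usually all that is needed for the group syntax itself.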
