Text File Pattern

This topic contains 9 replies, has 4 voices, and was last updated by Bob McCoy 1 year, 11 months ago.

  • #30933

    Ernesto Lombardi
    Participant
    Get-Item '.\file.txt'|Select-String -Pattern "$Test"
     

    This returns:

    [ 2015.10.10 10:00:00 ] Name > Message about $Test

    I would like to store the information on each line in variables.

    $Date = [ 2015.10.10 10:00:00 ]

    $Message = everything before $Test

  • #30934

    Curtis Smith
    Participant
    $line = '[ 2015.10.10 10:00:00 ] Name > Message about $Test'
    
    $Date = "$(($line -split "]")[0])]"
    $Message = (($line -split "]")[1] -split '\$')[0]
    
    $Date
    $Message
    [ 2015.10.10 10:00:00 ]
     Name > Message about 
    
  • #30936

    Jonathan Warnken
    Participant

    Here is another take on how to do it:

    $testfile = "d:\test\test.txt"
    '[ 2015.10.10 10:00:00 ] Name > Message about Something'|Out-File $testfile -Append
    '[ 2015.10.10 10:00:01 ] Name > Message about Nothing'|Out-File $testfile -Append
    '[ 2015.10.10 10:00:02 ] Name > Message about Something'|Out-File $testfile -Append
    $test = "Something"
    $results = Get-Content $testfile|Select-String $test
    foreach($result in $results){
        [PSCustomObject]@{
        'Date' = "$(($result.tostring() -split "]")[0])]"
        'Message' = (($result.tostring() -split "]")[1] -split "$test")[0]
        }
    }
    Date                    Message               
    ----                    -------               
    [ 2015.10.10 10:00:00 ]  Name > Message about 
    [ 2015.10.10 10:00:02 ]  Name > Message about 
    

    One thing that differs from Curtis's example is how the split for the message is done. The first example splits the text on the literal $ character, while my example splits on the string stored in the $test variable. This is an important distinction if $Test in your sample stands for a variable rather than literal text. Note that both Select-String and -split treat their argument as a regular expression, so a keyword that contains special characters like $ needs to be escaped first.

    Also, if you are going to be doing this for large amounts of data, set up a filter and use the pipeline to process the results:
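
    As a minimal sketch of handling a keyword that contains regex metacharacters (the sample line is the one from the first post; the variable names $hit, $escaped, and $afterDate are made up for illustration), the keyword can be matched as plain text with -SimpleMatch and escaped with [regex]::Escape before it is handed to -split. Unescaped, the pattern $Test would be read as the end-of-line anchor followed by "Test", so the split would never happen.

    ```powershell
    # Sample line from the first post; the keyword contains the regex metacharacter '$'.
    $line = '[ 2015.10.10 10:00:00 ] Name > Message about $Test'
    $test = '$Test'

    # -SimpleMatch makes Select-String treat the pattern as plain text, not as a regex.
    $hit = $line | Select-String -Pattern $test -SimpleMatch

    # -split always takes a regex, so escape the keyword before splitting on it.
    $escaped = [regex]::Escape($test)

    # Split once on the closing bracket, then cut the message off at the keyword.
    $afterDate = ($hit.Line -split '\]', 2)[1]
    $message = ($afterDate -split $escaped)[0].Trim()

    $message
    ```

    The Trim() call is an addition here; drop it if the surrounding spaces matter.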

    filter split-test{
        
        [CmdletBinding()]
        Param
        (
            [Parameter(Mandatory=$true,
            ValueFromPipeline=$true,
            ValueFromPipelineByPropertyName=$true)]
            [String]$text,
            [Parameter(Mandatory=$true,
            ValueFromPipelineByPropertyName=$true)]
            [String]$split
        )
        Process{
            [PSCustomObject]@{
            'Date' = "$(($text.tostring() -split "]")[0])]"
            'Message' = (($text.tostring() -split "]")[1] -split "$split")[0]
            }
        }
    }
    $testfile = "d:\test\test.txt"
    '[ 2015.10.10 10:00:00 ] Name > Message about Something'|Out-File $testfile -Append
    '[ 2015.10.10 10:00:01 ] Name > Message about Nothing'|Out-File $testfile -Append
    '[ 2015.10.10 10:00:02 ] Name > Message about Something'|Out-File $testfile -Append
    $test = 'Something'
    Get-Content $testfile|Select-String $test|split-test  -split $test
    Date                    Message
    ----                    -------
    [ 2015.10.10 10:00:00 ]  Name > Message about
    [ 2015.10.10 10:00:02 ]  Name > Message about
    
  • #30937

    Curtis Smith
    Participant

    Good point, Jonathan. I was just taking the returned string literally. I didn't consider the possibility that $Test could be representative of a variable. Good catch.

  • #30938

    Bob McCoy
    Participant

    Ernesto, we have a problem. We don't trust your example data. Is "$Test" a literal, or a variable that's interpreted when the script is run? For instance, I have a third take on what you're trying to accomplish.

    Please post 3-5 lines of real world data, sanitized as needed. Don't post a script extract as a substitute for data. Post the data. And then post what you're expecting for the corresponding output.

    We're just sort of guessing right now and that's going to waste time.

    And just for clarification, what do you mean by, "I would like to store the information on each line in variables"?

    Also, what version of PowerShell are you running? It matters in how we collect the results.

  • #30939

    Bob McCoy
    Participant

    Here's my take (guess) on what you're after.

    #requires -version 3
    # generate simulated input file
    @"
    [ 2015.10.10 10:00:00 ] Name > Message about Something
    [ 2015.10.10 10:00:01 ] Name > Message about Nothing
    [ 2015.10.10 10:00:02 ] Name > Message about Something
    "@ | Out-File -FilePath .\data.txt
    
    $pattern = "^(?'datetime'\[.+\])\s(?'message'.*)$"
    $results = foreach ($line in (Get-Content -Path .\data.txt)) {
        if ($line -match $pattern)
        {
            [PSCustomObject]@{
                TimeStamp = $Matches['datetime']
                Message = $Matches['message']
            }
        }
    }
    
    # display output collection
    $results
  • #30957

    Ernesto Lombardi
    Participant

    [ 2015.10.16 15:20:22 ] Jim > are you near location II-32?
    [ 2015.10.16 15:23:50 ] Joe > How is everyone to night?
    [ 2015.10.16 15:25:45 ] Bob > I am by II-32.

    The information is coming from chat logs.
    $Test is a variable I would like to use as a keyword identifier.

    In these 3 lines I am not concerned with line 2.
    But II-32 is my trigger word here.

    "^(?'datetime'\[.+\])\s(?'message'.*)$"

    I am still a little confused about how these patterns work. I have read PowerShell in a Month of Lunches, and I think chapter 24 is the one on regular expressions.

    I just do not follow the syntax.

  • #30958

    Curtis Smith
    Participant

    What you are looking at there, "^(?'datetime'\[.+\])\s(?'message'.*)$", is some really nice use of regular expression pattern matching. Bob is obviously a master at it. I don't even understand all of what he is doing there. It appears to match a pattern that begins with an open bracket [, has some data in the middle, and ends with a close bracket ]. It gives that matched pattern the label datetime. It then matches a whitespace character between that pattern and the next, then captures all the remaining characters on the line and labels them message. I've never seen labeling of the different pattern match sections before, so I may even be using the wrong terminology, but that's pretty cool stuff.
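
    To make the walkthrough above concrete, here is a small sketch that applies Bob's pattern to one of the sample lines and annotates each piece. The $Date and $Message names come from the original question; everything else is taken from the thread.

    ```powershell
    $line = '[ 2015.10.10 10:00:00 ] Name > Message about Something'

    # ^                     start of the line
    # (?'datetime'\[.+\])   group named "datetime": '[', one or more characters, ']'
    # \s                    exactly one whitespace character
    # (?'message'.*)        group named "message": whatever is left on the line
    # $                     end of the line
    $pattern = "^(?'datetime'\[.+\])\s(?'message'.*)$"

    # -match fills the automatic $Matches table when the pattern matches.
    if ($line -match $pattern) {
        $Date = $Matches['datetime']     # [ 2015.10.10 10:00:00 ]
        $Message = $Matches['message']   # Name > Message about Something
    }
    ```

    Because .+ is greedy, a message that itself contained "] " would stretch the datetime capture up to that later bracket; \[[^\]]+\] is one possible tighter alternative for the first group.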

  • #30959

    Bob McCoy
    Participant

    Yes, it's called a labeled capture (also known as a named capture group). Otherwise in the code it would have been $Matches[1] and $Matches[2]. In a script this size that's not too onerous. But as a rule I prefer labeled captures, so that when I use them in the PowerShell code I'm not trying to remember at what position a capture occurred in the overall pattern.

    After reading the additional information from Ernesto, I would probably pursue this differently now, with Select-String and a simple split. I'm still not quite sure what he's expecting for output.
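
    A short side-by-side sketch of the two styles, using one of Ernesto's chat lines. The variable names $byNumber, $byName, and $whole are only for illustration, and (?<name>...) is used here as the equivalent .NET spelling of (?'name'...).

    ```powershell
    $line = '[ 2015.10.16 15:25:45 ] Bob > I am by II-32.'

    # Unnamed groups are numbered left to right, starting at 1.
    if ($line -match '^(\[.+\])\s(.*)$') {
        $byNumber = $Matches[1], $Matches[2]
    }

    # Named groups are read by name; $Matches[0] always holds the whole match.
    if ($line -match '^(?<datetime>\[.+\])\s(?<message>.*)$') {
        $byName = $Matches['datetime'], $Matches['message']
        $whole = $Matches[0]
    }
    ```

    Both forms return the same two strings here; the named form simply keeps working if another group is later added in front of them.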
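
    One way the Select-String plus simple split idea could look is sketched below. This is an assumption about the intended approach, not code from the thread: the lines are held in an array instead of a file, and the output property names Date and Message are borrowed from the earlier replies.

    ```powershell
    # Sample chat lines from Ernesto's post, kept in memory for the sketch.
    $lines = @(
        '[ 2015.10.16 15:20:22 ] Jim > are you near location II-32?'
        '[ 2015.10.16 15:23:50 ] Joe > How is everyone to night?'
        '[ 2015.10.16 15:25:45 ] Bob > I am by II-32.'
    )
    $searchString = 'II-32'

    $results = $lines |
        Select-String -Pattern $searchString -SimpleMatch |
        ForEach-Object {
            # Split only once, so a ']' later in the message is left alone.
            $date, $rest = $_.Line -split '\]\s', 2
            [PSCustomObject]@{
                Date    = $date + ']'
                Message = $rest
            }
        }

    $results
    ```

    Limiting the split to two parts keeps the whole "Name > text" portion together in Message; a second split on ' > ' could separate the speaker if that is wanted.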

  • #30960

    Bob McCoy
    Participant

    Here's a revised version based on the additional input.

    #requires -version 3
    # generate simulated input file
    @"
     [ 2015.10.16 15:20:22 ] Jim > are you near location II-32?
     [ 2015.10.16 15:23:50 ] Joe > How is everyone to night?
     [ 2015.10.16 15:25:45 ] Bob > I am by II-32.
    "@ | Out-File -FilePath .\data.txt
    
    $pattern = "\[\s(?'datetime'.+)\s\]\s(?'message'.*)$"
    
    # sample search string, may come from Read-Host, a parameter, args, etc.
    $searchString = 'II-32'
    
    $data = Get-Content -Path .\data.txt | Select-String -Pattern $searchString
    $results = foreach ($line in $data) {
        if ($line -match $pattern)
        {
            [PSCustomObject]@{
                TimeStamp = $Matches['datetime']
                Message = $Matches['message']
            }
        }
    }
    
    # display output collection
    $results
    
