How to take specific words as objects from the text

This topic contains 4 replies, has 4 voices, and was last updated by Dave Wyatt 3 years, 10 months ago.

  • #9823
    Rolandas Brazauskas
    Participant

    Hello guys,

    I'm stuck on a PowerShell script that processes text.
    Here is the situation.
    I have a configuration file, config.txt, with the following text:
    -host "computer01.domain.net" -os "microsoft i386 wNT-5.2-S" -da A.06.20 -autodr A.06.20
    -host "computer02.domain.net" -os "microsoft i386 wNT-5.2-S" -mac 0015170ea10f -da A.06.20 -autodr A.06.20
    -host "computer03.domain.net" -os "microsoft amd64 wNT-6.1-S" -da A.06.20 -autodr A.06.20
    -host "computer04.domain.net" -os "microsoft amd64 wNT-6.1-S" -mac e83935c1a316 -da A.06.20 -autodr A.06.20
    -host "computer05.domain.net" -os "microsoft amd64 wNT-6.1-S" -da A.06.20 -autodr A.06.20
    -host "computer06.domain.net" -os "microsoft i386 wNT-5.2-S" -da A.06.20 -autodr A.06.20
    -host "computer07.domain.net" -os "gpl x86_64 linux-2.6.18-194.el5" -core A.07.00 -integ A.07.00 -da A.07.00 -cc A.07.00 -oracle8 A.07.00

    I need to take only the computer names from this file and list them like this:
    computer01.domain.net
    computer02.domain.net
    computer03.domain.net
    ...
    computer07.domain.net

    but I don't have many ideas about how to do this.
    I'm using a script with the -split operator for this:

    $file = Get-Content C:\temp\config.txt
    $file2 = $file -split '-host "'
    $file3 = $file2 -split '" -os '

    which gives me text like this:

    computer01.domain.net
    "microsoft i386 wNT-5.2-S" -da A.06.20 -autodr A.06.20

    computer02.domain.net
    "microsoft i386 wNT-5.2-S" -mac 0015170ea10f -da A.06.20 -autodr A.06.20

    computer03.domain.net
    "microsoft amd64 wNT-6.1-S" -da A.06.20 -autodr A.06.20

    computer04.domain.net
    "microsoft amd64 wNT-6.1-S" -mac e83935c1a316 -da A.06.20 -autodr A.06.20

    computer05.domain.net
    "microsoft amd64 wNT-6.1-S" -da A.06.20 -autodr A.06.20

    computer06.domain.net
    "microsoft i386 wNT-5.2-S" -da A.06.20 -autodr A.06.20

    computer07.domain.net
    "gpl x86_64 linux-2.6.18-194.el5" -core A.07.00 -integ A.07.00 -da A.07.00 -cc A.07.00 -oracle8 A.07.00

    Now I could pick out the server names by assigning a variable to every 3rd row, but I understand that this is the hard way to get the final result 🙂
    From my script you can see that I'm new to scripting, but I hope you can give me some advice on how to get specific words as objects from the text. I only found how to turn a whole string into an object.
    Thank you in advance!

  • #9826
    Matt Tilford
    Participant

    You are pretty much on the right track with what you have. What is causing you a bit of a headache is those silly quote marks. If you change your first line to this

    $file = (Get-Content C:\temp\config.txt) -replace '“','"' -replace '”','"'

    then you will get some nice consistent characters to work with.

    $file2 = $file -split '-host "'

    could then be done with -split '"' instead, which gives you a much nicer array to work with. The nicest way to split would be with a regular expression, but as you are just starting out in scripting you may want to avoid those dark arts for now.

    You also need to process each line in turn, which a foreach loop will do nicely.

    $file = (Get-Content C:\temp\config.txt) -replace '“','"' -replace '”','"'
    foreach ($line in $file) {
        # Index the split result first, so $computername holds only the name
        $computername = ($line -split '"')[1]
        $computername
    }

    So that will process each line in turn, split the data on the normal quote marks that we put in and create an array. The first element of that array is -host. The second is the computer name itself. [1] will get the second element in the array ([0] gets the first).
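
    To make the indexing concrete, here is a small sketch that splits one sample line (copied from the config file in the question) and looks at individual elements. The variable names $line and $parts are only for illustration.

```powershell
# One line copied from the config file shown in the question
$line = '-host "computer01.domain.net" -os "microsoft i386 wNT-5.2-S" -da A.06.20 -autodr A.06.20'

# Split on the plain double quote; every quoted value lands at an odd index
$parts = $line -split '"'

$parts[0]   # '-host ' (note the trailing space)
$parts[1]   # 'computer01.domain.net'
$parts[3]   # 'microsoft i386 wNT-5.2-S'
```

    Because the quoted values sit at the odd indexes, the same split would also give you the OS string at [3] if you need it later.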

  • #9828
    Dave Wyatt
    Moderator

    Oddly, I couldn't find a ready-made generic tokenizer for .NET or PowerShell (which makes me want to write one soon). However, since your text file's syntax basically matches a PowerShell command line's quoting rules, we can cheat and use the PSParser class:

    $filePath = 'c:\temp\config.txt'
    
    Get-Content -Path $filePath |
    ForEach-Object {
        $tokens = [System.Management.Automation.PSParser]::Tokenize($_, [ref]$null)
        
        $fetchNextToken = $false
        foreach ($token in $tokens)
        {
            if ($fetchNextToken)
            {
                Write-Output $token.Content
                $fetchNextToken = $false
            }
            elseif ($token.Content -eq '-host')
            {
                $fetchNextToken = $true
            }
        }
    }
    

    Edit: I was assuming that the "smart quotes" in your post came from either the forum software or Microsoft Word or something. If they're actually in your data file, then it might throw off the PSParser as well.

  • #9831
    Bob McCoy
    Participant

    This is also relatively easy to accomplish with a regex.

    $hosts = Get-Content .\hostfile.txt
    foreach ($computer in $hosts)
    {
        $computer -match '-host "(?<hostname>.*?)"' | Out-Null
        $Matches['hostname']
    }

    I will also echo Dave's comments about the quote marks in the original post. The above works with standard quotes (ASCII 34).

  • #9832
    Dave Wyatt
    Moderator

    True, as long as the hostname is guaranteed to always have double quotes around it (though you could modify the regex to handle both cases).
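
    As a rough sketch of that modification, the pattern below treats the quotes as optional. The pattern itself and the second, unquoted sample line are assumptions made for illustration; the config file in the question only shows quoted hostnames.

```powershell
# Hypothetical pattern: the double quotes around the hostname are optional,
# and the name itself is any run of characters that are not quotes or whitespace
$pattern = '-host\s+"?(?<hostname>[^"\s]+)"?'

$lines = @(
    '-host "computer01.domain.net" -os "microsoft i386 wNT-5.2-S"',   # quoted, as in the question
    '-host computer02.domain.net -os "microsoft amd64 wNT-6.1-S"'     # unquoted, made up for this example
)

$names = foreach ($line in $lines) {
    # Read $Matches only after a successful match, so a non-matching line
    # cannot return the hostname left over from a previous line
    if ($line -match $pattern) { $Matches['hostname'] }
}

$names
```

    The if guard is a small design choice: -match leaves $Matches untouched when it fails, so checking the result avoids repeating the previous line's hostname.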
