Author Posts

April 3, 2017 at 8:26 pm

Hello,

Newb here and if possible i would like some assistance with a powershell script. I need to be able to read through a text file and extract 4 pieces of information from each line and export it to a text file. My current script adds and removes a bunch of regex to each line in order to remove everything I don't need but unfortunately i have to keep changing it every month.

Example of the Text below:
mrc_app_access_20170224.log: JOBID2680528370: MRC_MAIN.S00846s-BatchJob ended: Fri Feb 24 04:20:53 EST 2017. time used: 1248422ms ****

The ideal output: 20170224 S00846 BatchJob 04:20:53

The rest of the data is garbage for my purposes. I found some helpful info here https://social.technet.microsoft.com/Forums/windowsserver/en-US/77c03443-927b-43a3-a27b-77a6c8bd238c/powershell-read-only-specific-parts-of-a-line-in-a-txt-file?forum=winserverpowershell
But I am not versed enough in powershell to make it work for my needs.

Any powershell gurus out there know of a easy solution for me? Any and all help is appreciated.

Thanks in advance

April 3, 2017 at 9:00 pm

$myInput = 'mrc_app_access_20170224.log: JOBID2680528370: MRC_MAIN.S00846s-BatchJob ended: Fri Feb 24 04:20:53 EST 2017. time used: 1248422ms ****'

$myOutput = New-Object -TypeName PSObject -Property @{
    # Identified Date as the 8 characters preceeding '-log:'
    Date     = $myInput.Substring($myInput.IndexOf('.log:')-8,8)
    # Identified BatchJob as the 7 characters preceeding '-BatchJob'
    BatchJob = $myInput.Substring($myInput.IndexOf('-BatchJob')-7,7)
    # Identified Ended as the 8 characters that start 12 characters after 'ended:'
    Ended    = $myInput.Substring($myInput.IndexOf('ended:')+18,8)
}

$myOutput

April 3, 2017 at 11:51 pm

Sam,

Would I have to place that code into a foreach loop? If so can you show me as I am not very good at those? And also am I correct in my assumption that I can change the number in the indexof(x,x) for my needs? The location where you see batch jobs is usually usernames and I would need to capture one more variable if not more.

April 4, 2017 at 12:34 am

You can put this into a loop if you have multiple text files.

$file = Get-ChildItem 'C:\users\user99\desktop\MRC Access Logs\combinedlogs.txt'

$appusage = switch -Regex -File $file {
'_(\d{8}).*MAIN.(.*)-(.*) [se].*(\d{2}:\d{2}:\d{2})' 
    {    [PSCustomObject]@{
            Date = $Matches[1]
            BatchJob = $Matches[2]
            UserName = $Matches[3]
            End = $Matches[4]}
    }
}

$appusage | Out-File "C:\users\mvincenty\desktop\MRC Access Logs\appusage.txt"
# If you prefer a csv file
$appusage | Export-Csv "C:\users\mvincenty\desktop\MRC Access Logs\appusage.csv" -NoTypeInformation

April 4, 2017 at 1:23 am

in a loop:

$myInput = Get-Content .\myfile.txt # Array input

$myOutput = $myInput | % {
    New-Object -TypeName PSObject -Property @{
        # Identified Date as the 8 characters preceding '-log:'
        Date     = $myInput.Substring($myInput.IndexOf('.log:')-8,8)
        # Identified BatchJob as the 7 characters preceding '-BatchJob'
        BatchJob = $myInput.Substring($myInput.IndexOf('-BatchJob')-7,7)
        # Identified Ended as the 8 characters that start 12 characters after 'ended:'
        Ended    = $myInput.Substring($myInput.IndexOf('ended:')+18,8)
    }
}

$myOutput # Array output

April 4, 2017 at 2:12 am

I have modified my code to the below:

$textfile1 = "C:\users\user99\desktop\MRC Access Logs\combinedlogs.txt"

$myInput = $textfile1 # Array input

$myOutput = $myInput | % {
    New-Object -TypeName PSObject -Property @{
        # Identified Date as the 8 characters preceding '-log:'
        Date     = $myInput.Substring($myInput.IndexOf('.log:')-8,8)
        # Identified Application as the 16 characters preceding '-'
        Application = $myInput.Substring($myInput.IndexOf('-')-16,16)
        # Identified Username as the 7 characters following '-'
        $start = $myInput.indexOf("-") + 1
        $end = $myInput.indexOf("d:", $start)
        $length = $end - $start
        Username = $myInput.Substring($start, $lenght)
        # Identified Time as the 8 characters preceding 'EST'
        Time    = $myInput.Substring($myInput.IndexOf('EST')-9,8)
    }
}
# Array output
$myOutput  | Out-File "C:\users\mvincenty\desktop\MRC Access Logs\appusage.txt"

I am getting the error below
Exception calling "Substring" with "2" argument(s): "StartIndex cannot be less than zero.
Parameter name: startIndex"
At line:2 char:5
New-Object -TypeName PSObject -Property @{
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
CategoryInfo : NotSpecified: (:) [], MethodInvocationException
FullyQualifiedErrorId : ArgumentOutOfRangeException

I had to change the line for the username to the above due to the dynamic range of what the username could vary. If any of it looks incorrect please advise. Below is example of the file:
mrc_app_access_20170224.log: JOBID4218313870: MRC_MAIN.R05847-user1 started: Fri Feb 24 08:16:23 EST 2017
mrc_app_access_20170224.log: JOBID4218477858: MRC_MAIN.R05847-user1 started: Fri Feb 24 08:16:24 EST 2017
mrc_app_access_20170224.log: JOBID4218729567: MRC_MAIN.R00907-user23 started: Fri Feb 24 08:16:27 EST 2017
mrc_app_access_20170224.log: JOBID4218729569: MRC_MAIN.I00150-user23 started: Fri Feb 24 08:16:27 EST 2017
mrc_app_access_20170224.log: JOBID4218729569: MRC_MAIN.I00150-user23 ended: Fri Feb 24 08:16:27 EST 2017. time used: 15ms

Thank you for all your assistance.

April 4, 2017 at 4:11 am

Check my previous post for an updated solution.

April 4, 2017 at 12:25 pm

random commandline,

Thanks for the help the changes in your code worked but I am getting additional data that I will need to remove. I was wondering if you could explain how your code works?

In the text version i am seeing where the username or batchjob as user1...(ex. gdonhaus...) while in the csv it is grabbing more such as the wording "(action ...)" or "started:". I would like to modify the code you provided to grab from '-' in '-user etc' until the blank space.

mrc_app_access_20170224.log: JOBID4279904271: MRC_MAIN.MRCView-user11111 (action: WW List) ended: Fri Feb 24 08:26:39 EST 2017. time used: 14ms

April 4, 2017 at 7:58 pm

Ok, replace the old regular expression line with the new. I used a switch statement which reads one line at a time utilizing less memory and processes larger text files faster. The regular expression uses capture groups (parentheses) to match the parts you want. This (.*?) will match the smallest set and not be greedy (which is what happened with your last example).

# Old
'_(\d{8}).*MAIN.(.*)-(.*) [se].*(\d{2}:\d{2}:\d{2})'
# New
'_(\d{8}).*MAIN.(.*)-(.*?) .*(\d{2}:\d{2}:\d{2})' 

April 4, 2017 at 8:13 pm

Awesome.

Thanks for the help guys. This seem to have solved my issue. Now i just have to complete the script by filtering out un-needed application.

Thanks,

April 4, 2017 at 11:11 pm

Random Commandline,

I have 3 questions,

1. In the text version of the output some random username have user... after their name. Is the regex truncating the name and how can i get around this?
2. Can I use the csv same as the text file for the following command?

Get-Content $textfile2 | where-object {$_ -notmatch '.*mrcmenu*' -and $_ -notmatch '.*I00030*' -and $_ -notmatch '.*I00035*' -and $_ -notmatch '.*I00050*' -and $_ -notmatch '.*I00060*' -and $_ -notmatch '.*I00080*' -and $_ -notmatch '.*I00100*' -and $_ -notmatch '.*I00140*' -and $_ -notmatch '.*I00150*' -and $_ -notmatch '.*I00160*' -and $_ -notmatch '.*I00200*' -and $_ -notmatch '.*I00240*' -and $_ -notmatch '.*I00333*' -and $_ -notmatch '.*I00372*' -and $_ -notmatch '.*I00373*' -and $_ -notmatch '.*I00382*' -and $_ -notmatch '.*I00401 *' -and $_ -notmatch '.*I00402*' -and $_ -notmatch '.*I00403*' -and $_ -notmatch '.*I00442*' -and $_ -notmatch '.*I00472*' -and $_ -notmatch '.*I00642*' -and $_ -notmatch '.*I00643*' -and $_ -notmatch '.*I00712*' -and $_ -notmatch '.*I00713*' -and $_ -notmatch '.*I00732*' -and $_ -notmatch '.*I00752*' -and $_ -notmatch '.*I00753*' -and $_ -notmatch '.*I00762*' -and $_ -notmatch '.*I00772*' -and $_ -notmatch '.*I00852*' -and $_ -notmatch '.*I00863*' -and $_ -notmatch '.*I00912*' -and $_ -notmatch '.*I00913*' -and $_ -notmatch '.*I00925*' -and $_ -notmatch '.*I00932*' -and $_ -notmatch '.*I00993*' -and $_ -notmatch '.*I01052*' -and $_ -notmatch '.*I01053*' -and $_ -notmatch '.*I01072*' -and $_ -notmatch '.*I01152*' -and $_ -notmatch '.*I01182*' -and $_ -notmatch '.*I01192*' -and $_ -notmatch '.*I01222*' -and $_ -notmatch '.*I01262*' -and $_ -notmatch '.*I01263*' -and $_ -notmatch '.*I01493*' -and $_ -notmatch '.*I01494*' -and $_ -notmatch '.*I01495*' -and $_ -notmatch '.*I01496*' -and $_ -notmatch '.*I01497*' -and $_ -notmatch '.*I01515*' -and $_ -notmatch '.*I01605*' -and $_ -notmatch '.*I01615*' -and $_ -notmatch '.*I01927*' -and $_ -notmatch '.*I02007*'} | set-content 'C:\users\mvincenty\desktop\MRC Access Logs\appdataimport.txt'

3. How can I shorten the code above in length, time, processing and memory? I will also need it to be flexible as i will need to add more "exclusions" to the list.

April 5, 2017 at 12:34 am

1&2. I recommend using csv files. You don't have to worry about the truncating issue, but this might work.

$appusage | Format-Table -AutoSize | 
Out-File "C:\users\mvincenty\desktop\MRC Access Logs\appusage.txt" 

3. That is a long list. Will this work for you?

(Get-Content $textfile2) -notmatch 'I00|mrcmenu'

April 5, 2017 at 2:00 am

3. Yes and no. I need some of the I's as some are reports the rest are option list. However I could use the S00. Would it be faster if I just used csv format instead?