Recording File Names


This topic contains 9 replies, has 3 voices, and was last updated by PB 2 years, 8 months ago.

  • #34470
    PB

    Participant
    Points: 0
    Rank: Member

    I have thousands of unique strings that I need to capture from unstructured files.
    These files are stored on 9 UNC shares. The string I need is actually part of each file name on these shares, so this should be possible if you have the know-how.
    The file names on these shares follow only two naming formats:
    PN_DDMMYYYY_HHMMSS_PARTNUMBER.txt
    bookings_PARTNUMBER_randomtext.txt

    A typical path is \\share1\customer\saleID\. Note that there can be a further subdirectory underneath saleID. Each customer has their own directory, and each saleID folder is unique.
    I need to capture every unique PARTNUMBER from these file names across the 9 shares.
    There may be several files with the same part number, but I only need to record it once (i.e. remove duplicates) and save the results to a *.CSV file.
    Help appreciated as I have no clue where to start!

  • #34471

    Participant

    Do ALL the part numbers have a fixed length? Or does it vary?

    And is the part number strictly numeric? Or is it mixed?

  • #34472
    PB

    Participant

    Hi, thanks. Variable length and alphanumeric, but always in the same position in the file name examples given.
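
    Because the part number always sits in the same underscore-separated position, one option is to split the file name rather than use a regular expression. The sketch below is only an illustration: the function name and the sample file names are made up, and it assumes the part number itself never contains an underscore.

```powershell
# Hypothetical helper (not from the thread): pull the part number out of a
# file name by position, assuming the part number contains no underscores.
function Get-PartNumberByPosition {
    param([string]$FileName)

    # Drop the extension, then break the name into underscore-separated fields.
    $fields = [System.IO.Path]::GetFileNameWithoutExtension($FileName) -split '_'

    if ($fields[0] -eq 'PN' -and $fields.Count -ge 4) {
        return $fields[3]    # PN_DDMMYYYY_HHMMSS_PARTNUMBER
    }
    if ($fields[0] -eq 'bookings' -and $fields.Count -ge 3) {
        return $fields[1]    # bookings_PARTNUMBER_randomtext
    }
    # Any other file name produces no output.
}

# Made-up sample names in the two formats described above.
Get-PartNumberByPosition 'PN_01022016_093000_AB12.txt'
Get-PartNumberByPosition 'bookings_AB12_sometext.txt'
```

    If part numbers can contain underscores, this positional approach would need rethinking, since the field boundaries would no longer be unambiguous.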

  • #34475

    Participant

    List your shares in the $shares variable; the script then exports the unique part numbers to a CSV file in your user profile folder.

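    # Get files from every share listed in $shares
    $shares = "\\share1\","\\share2\"
    $files = Get-ChildItem -Path $shares -Recurse -Include PN_*.txt,bookings_*.txt
    
    # Match part numbers against the file name only (not the full path) and add them to a collection.
    # The bookings branch assumes the part number itself contains no underscore.
    $partnumbers = New-Object 'System.Collections.Generic.List[string]'
    foreach ($file in $files){
    If ($file.Name -match '^(?:bookings_(?<pn>[^_]+)_|PN_\d{8}_\d{6}_(?<pn>.+)\.txt$)') {$partnumbers.Add($Matches.pn)}
    } # End Foreach
    
    # Export unique part numbers, one per line
    $partnumbers | Select-Object -Unique | Out-File $env:USERPROFILE\partnumber.csv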
    
  • #34476

    Participant

    In this sample I've simulated your converged filename list. I am assuming at this point you know how to get the list of files. I use RegEx to extract the part numbers from the file names per your example and information above. I collect all the part numbers in an array, then I sort it for unique numbers and dump it to a text file.

    $data = @"
    PN_31012016_112345_123456.txt
    bookings_23456_randomtext.txt
    PN_25062015_204500_345.txt
    bookings_6666A6666_randomtext.txt
    PN_02031999_153000_222Q1.txt
    bookings_911zzz_randomtext.txt
    PN_04071776_120000_345.txt
    bookings_456789_randomtext.txt
    "@ -split "\r?\n"
    
    $pnPattern = "PN_\d{8}_\d{6}_(.+)\."
    $bookingsPattern = "bookings_(.+)_"
    $partNumber = New-Object -TypeName System.Collections.ArrayList
    
    foreach ($item in $data) {
        Write-Verbose "Working on $item"
        if ($item -match $pnPattern)
        {
            $partNumber.Add($Matches[1]) | Out-Null
        } elseif ($item -match $bookingsPattern)
        {
            $partNumber.Add($Matches[1]) | Out-Null
        }
    }
    $partNumber | Sort-Object -Unique | Out-File -FilePath .\foo.txt
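
    The sample above works on a simulated list. A rough sketch of how the same matching might be wired to real files follows. The function name, the share paths in the usage comment, and the output path are placeholders, and the bookings pattern assumes the part number contains no underscore.

```powershell
# Sketch only: Get-UniquePartNumber is a hypothetical name, not part of the thread.
function Get-UniquePartNumber {
    param([string[]]$Path)

    $pnPattern       = '^PN_\d{8}_\d{6}_(.+)\.txt$'
    $bookingsPattern = '^bookings_([^_]+)_'   # assumes no underscore inside the part number

    Get-ChildItem -Path $Path -Recurse -File -Filter '*.txt' |
        ForEach-Object {
            # Test the file name only, so folder names cannot interfere with the match.
            if ($_.Name -match $pnPattern -or $_.Name -match $bookingsPattern) {
                $Matches[1]
            }
        } |
        Sort-Object -Unique
}

# Possible usage (placeholder share paths):
# Get-UniquePartNumber -Path '\\share1\customer', '\\share2\customer' |
#     ForEach-Object { [pscustomobject]@{ PartNumber = $_ } } |
#     Export-Csv -Path "$env:USERPROFILE\partnumber.csv" -NoTypeInformation
```

    Wrapping each value in an object with a PartNumber property gives Export-Csv a header column, which plain strings would not.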
    
  • #34491
    PB

    Participant

    This is great – thank you both (Bob McCoy and random commandline)

    @Bob McCoy
    The bookingsPattern is not working, maybe because of the file names. I had said random text, but the suffix is actually a date and time, written in a different format from the one used in the PN*.txt file names.

    Here is an example:

    bookings_SKU56780-13_11-29-2011_01-43-47.txt

    Does the $bookingsPattern = "bookings_(.+)_" need to change to match this?

    Thanks again.

  • #34492
    PB

    Participant

    @random commandline
    # Get Files section works as expected

    # Match part numbers and add to collection does not work as expected – and I have made the change you requested. The output for $partnumbers is:

    ÿþ

    # Export part numbers works as expected
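
    The two characters ÿþ are how the bytes FF FE appear when read as single-byte text. Those bytes are the UTF-16 LE byte order mark, which Out-File writes by default in Windows PowerShell, so a file showing nothing else probably means the collection was empty and no file name matched. A minimal sketch of a guarded export follows; the sample values and the output path are placeholders, not data from the thread.

```powershell
# Hypothetical sample values standing in for the matched part numbers.
$partnumbers = 'SKU56780-13', '123456', 'SKU56780-13'

if (-not $partnumbers) {
    # Nothing matched: warn instead of writing a file that holds only a byte order mark.
    Write-Warning 'No part numbers matched; check the regular expression.'
} else {
    $csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'partnumber.csv'   # placeholder path
    $partnumbers |
        Sort-Object -Unique |
        ForEach-Object { [pscustomobject]@{ PartNumber = $_ } } |
        Export-Csv -Path $csvPath -NoTypeInformation -Encoding UTF8
}
```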

  • #34510

    Participant

    Ok, I modified my script. I forgot an underscore after the date. It should work for you now.

  • #34517

    Participant

    In my script make the following change.

    $bookingsPattern = "bookings_(.+?)_"
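
    The reason the extra ? helps is that (.+) is greedy and runs to the last underscore in the name, while (.+?) is lazy and stops at the first one. A short demonstration using the file name PB posted (the variable names are just for illustration):

```powershell
$name = 'bookings_SKU56780-13_11-29-2011_01-43-47.txt'

# Greedy: (.+) grabs as much as it can, so the capture runs up to the LAST underscore.
if ($name -match 'bookings_(.+)_')  { $greedy = $Matches[1] }

# Lazy: (.+?) grabs as little as it can, so the capture stops at the FIRST underscore.
if ($name -match 'bookings_(.+?)_') { $lazy = $Matches[1] }

"greedy: $greedy"   # greedy: SKU56780-13_11-29-2011
"lazy:   $lazy"     # lazy:   SKU56780-13
```

    The lazy version still relies on the part number containing no underscore of its own.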
    
  • #34537
    PB

    Participant

    Thank you both!

The topic ‘Recording File Names’ is closed to new replies.