Recording File Names


This topic contains 9 replies, has 3 voices, and was last updated by PB 10 months, 1 week ago.

  • #34470
    PB
    Participant

    I have thousands of unique strings that I need to capture from unstructured files.
    These files are stored on nine UNC shares. The string I need is actually part of each file name on these shares, so this should be possible if you have the know-how.
    The file names on these shares follow only two naming formats:
    PN_DDMMYYYY_HHMMSS_PARTNUMBER.txt
    bookings_PARTNUMBER_randomtext.txt

    A typical path is \\share1\customer\saleID\. Note that there can be a further subdirectory underneath saleID. Each customer has their own directory, and each saleID folder is unique.
    I need to capture every unique PARTNUMBER from these file names across the nine shares.
    More than one file may carry the same part number, but I only need to record it once (i.e. remove duplicates) and save the results to a CSV file.
    Help appreciated, as I have no clue where to start!

  • #34471
    Bob McCoy
    Participant

    Do ALL the part numbers have a fixed length? Or does it vary?

    And is the part number strictly numeric? Or is it mixed?

  • #34472
    PB
    Participant

    Hi, thanks. Variable length, alphanumeric, but always in the same position as in the file name examples given.

  • #34475
    random commandline
    Participant

    List your shares in the $shares variable and the script will export a CSV file to your user profile folder.

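    # Get files from every share (list all nine shares here)
    $shares = "\\share1\","\\share2\"
    $files = Get-ChildItem -Path $shares -Recurse -Include PN_*.txt,bookings_*.txt
    
    # Match part numbers against the file name only (not the full path) and add to collection
    # Assumes a bookings part number contains no underscore and PN files use the 8-digit date, 6-digit time layout
    $partnumbers = New-Object 'System.Collections.Generic.List[string]'
    foreach ($file in $files){
    If ($file.Name -match '^(?:bookings_(?<pn>[^_]+)_|PN_\d{8}_\d{6}_(?<pn>.+)\.txt$)') {$partnumbers.Add($Matches.pn)}
    } # End Foreach
    
    # Export part numbers, one per line, with duplicates removed
    $partnumbers | Select-Object -Unique | Out-File $env:USERPROFILE\partnumber.csv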
    
  • #34476
    Bob McCoy
    Participant

    In this sample I've simulated your converged filename list. I am assuming at this point you know how to get the list of files. I use RegEx to extract the part numbers from the file names per your example and information above. I collect all the part numbers in an array, then I sort it for unique numbers and dump it to a text file.

    $data = @"
    PN_31012016_112345_123456.txt
    bookings_23456_randomtext.txt
    PN_25062015_204500_345.txt
    bookings_6666A6666_randomtext.txt
    PN_02031999_153000_222Q1.txt
    bookings_911zzz_randomtext.txt
    PN_04071776_120000_345.txt
    bookings_456789_randomtext.txt
    "@ -split "\r?\n"
    
    $pnPattern = "PN_\d{8}_\d{6}_(.+)\."
    $bookingsPattern = "bookings_(.+)_"
    $partNumber = New-Object -TypeName System.Collections.ArrayList
    
    foreach ($item in $data) {
        Write-Verbose "Working on $item"
        if ($item -match $pnPattern)
        {
            $partNumber.Add($Matches[1]) | Out-Null
        } elseif ($item -match $bookingsPattern)
        {
            $partNumber.Add($Matches[1]) | Out-Null
        }
    }
    $partNumber | sort -Unique | Out-File -FilePath .\foo.txt
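
    The sample above starts from a simulated list. As a minimal sketch of how $data could be filled from real files instead, the snippet below builds a throwaway folder tree under the temp directory (a stand-in for the shares, with made-up contents) and collects just the file names. The $root and $sale variables are illustrative only; in real use $shares would hold the nine UNC paths.

    ```powershell
    # Stand-in for the real shares: a temporary folder holding three sample files.
    $root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
    $sale = Join-Path (Join-Path $root 'customer') 'saleID'
    New-Item -ItemType Directory -Path $sale -Force | Out-Null
    New-Item -ItemType File -Path (Join-Path $sale 'PN_31012016_112345_123456.txt') | Out-Null
    New-Item -ItemType File -Path (Join-Path $sale 'bookings_23456_randomtext.txt') | Out-Null
    New-Item -ItemType File -Path (Join-Path $sale 'notes.txt') | Out-Null

    # In real use, $shares would list the nine UNC paths instead of $root.
    $shares = @($root)

    # Recurse through every share and keep only the names of files in the two formats.
    $data = Get-ChildItem -Path $shares -Recurse -Include 'PN_*.txt','bookings_*.txt' |
        Select-Object -ExpandProperty Name
    ```

    Working with the Name property rather than the full path keeps underscores in customer or saleID folder names from interfering with the patterns.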
    
  • #34491
    PB
    Participant

    This is great – thank you both (Bob McCoy and random commandline)

    @Bob McCoy
    The bookingsPattern is not working, probably because of the actual file names. I said the suffix was random text, but it is really a date and time, written in a different format from the one used in the PN*.txt files.

    Here is an example:

    bookings_SKU56780-13_11-29-2011_01-43-47.txt

    Does the $bookingsPattern = "bookings_(.+)_" need to change to match this?

    Thanks again.

  • #34492
    PB
    Participant

    @random commandline
    # Get Files section works as expected

    # Match part numbers and add to collection does not work as expected – and I have made the change you requested. The output for $partnumbers is:

    ÿþ

    # Export part numbers works as expected

  • #34510
    random commandline
    Participant

    Ok, I modified my script. I forgot an underscore after the date. It should work for you now.
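
    A side note on the ÿþ output reported above: as far as I know, those two characters are the UTF-16 byte order mark that Out-File writes by default in Windows PowerShell, so a file containing nothing else most likely means no part numbers were matched. A minimal sketch of guarding against that case follows; the sample values, the $unique and $outFile names, and the temp-folder location are placeholders, not part of the script above.

    ```powershell
    # Hypothetical sample standing in for the $partnumbers collection.
    $partnumbers = @('345', '123456', '345')

    # Wrap in @() so .Count works even when there is one result or none.
    $unique = @($partnumbers | Select-Object -Unique)

    if ($unique.Count -eq 0) {
        # Nothing matched: warn instead of writing a file that holds only a byte order mark.
        Write-Warning 'No part numbers matched; test the pattern against a sample file name.'
    } else {
        $outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'partnumber.csv'
        # ASCII output has no byte order mark; pick another encoding if part numbers can contain non-ASCII characters.
        $unique | Out-File -FilePath $outFile -Encoding ascii
    }
    ```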

  • #34517
    Bob McCoy
    Participant

    In my script make the following change.

    $bookingsPattern = "bookings_(.+?)_"
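
    To illustrate why the added ? matters, here is a small sketch run against the file name PB posted. With the greedy (.+) the capture runs up to the last underscore in the name; with the lazy (.+?) it stops at the first one. The $name, $greedy and $lazy variables are only for this demonstration.

    ```powershell
    $name = 'bookings_SKU56780-13_11-29-2011_01-43-47.txt'

    # Greedy: (.+) grabs as much as it can while still leaving one underscore to match.
    if ($name -match 'bookings_(.+)_')  { $greedy = $Matches[1] }

    # Lazy: (.+?) grabs as little as it can, so it stops before the first underscore.
    if ($name -match 'bookings_(.+?)_') { $lazy = $Matches[1] }

    "Greedy: $greedy"   # Greedy: SKU56780-13_11-29-2011
    "Lazy:   $lazy"     # Lazy:   SKU56780-13
    ```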
    
  • #34537
    PB
    Participant

    Thank you both!
