Regex and Various Matches

This topic contains 5 replies, has 4 voices, and was last updated by Aaron Hardy 8 months, 2 weeks ago.

  • #63280

    Aaron Hardy
    Participant

    I am retrieving a single string stored in a text file, in one of the formats below, and then need to find all files in a specified directory that match that string.

    String possibilities in file (n = any number):
    17-0201_n
    17-0201-cn_n
    17-0201M_n
    17-0201M-cn_n
    17-0201E-cn_n

    17-0201... is just an example. It could also be 15-0923, 19-1215M, and so on.

    17-0201 should match only 17-0201, not 17-0201M or 17-0201E.

    I assume a regex is the best way to match files against the string from the file, and that is where I am stuck. If not, kindly direct me to a better solution.

    Thanks for your input.

  • #63285

    Rob Simmers
    Participant

    Is this what you are looking for?

    $values = "17-0201_n",
              "17-0201-cn_n",
              "17-0201M_n",
              "17-0201M-cn_n",
              "17-0201E-cn_n",
              "15-0923",
              "19-1215M"
    
    
    # Extract the first NN-NNNN code found in each value, ignoring any suffix.
    $matched = foreach ($value in $values) {
        $code = [regex]::Match($value, "\d{2}-\d{4}").Value
        Write-Host ("Found match {0} for value {1}" -f $code, $value)
        $code
    }
    
    # Collapse duplicates so each code is listed once.
    $matched | Select-Object -Unique
    

    Output:

    Found match 17-0201 for value 17-0201_n
    Found match 17-0201 for value 17-0201-cn_n
    Found match 17-0201 for value 17-0201M_n
    Found match 17-0201 for value 17-0201M-cn_n
    Found match 17-0201 for value 17-0201E-cn_n
    Found match 15-0923 for value 15-0923
    Found match 19-1215 for value 19-1215M
    
    17-0201
    15-0923
    19-1215
    

    You could update the pattern to "^\d{2}-\d{4}$" if you only expect a match for 15-0923
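
    To illustrate the anchored pattern suggested above, here is a minimal sketch. The sample values are taken from this thread; the variable names are only placeholders. With both `^` and `$` in place, a value matches only when it consists of the bare code and nothing else.

```powershell
# Sample values from the thread; only the bare code should survive the filter.
$values = '15-0923', '19-1215M', '17-0201_n'

# ^ and $ anchor the pattern to the whole string, so any suffix is rejected.
$exact = $values | Where-Object { $_ -match '^\d{2}-\d{4}$' }

$exact   # only 15-0923
```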

  • #63289

    Aaron Hardy
    Participant

    "You could update the pattern to "^\d{2}-\d{4}$" if you only expect a match for 15-0923"

    Thanks, Rob. Your example accomplishes part of what I need, and I already have something similar working. My problem is that I am trying to account for an additional character after the last 4 digits, which may or may not exist, and the same goes for "-cn".

    Examples: 17-0201, 17-0201-cn, 17-0201M-cn, and so on.

    So, take:

    [string]$code = '17-0201'
    

    Then use a regex to see if any file name(s) in folder X match the string.

    This should return matching files:

    17-0201_0
    17-0201-cn_0
    17-0201-cn_1
    17-0201-cn_2

    (the _0, _1, etc. is used in the file name to prevent original files from being overwritten)

    But not return:
    17-0201M_0
    17-0201M-cn_0
    17-0201E-cn_0

    Hope this makes more sense.

    Thanks again.
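
    One possible way to express the requirement above, as a sketch rather than a definitive answer: build the pattern from the code, allow an optional "-cn", and require the "_number" suffix. It assumes "-cn" and "_<digits>" are the only things that may follow the code. The names `$fileNames`, `$code`, `$pattern` and `$found` are hypothetical, and the array stands in for the real contents of folder X.

```powershell
# Hypothetical file names standing in for the contents of folder X.
$fileNames = '17-0201_0', '17-0201-cn_0', '17-0201-cn_1', '17-0201-cn_2',
             '17-0201M_0', '17-0201M-cn_0', '17-0201E-cn_0'

$code = '17-0201'

# Escape the code so it is matched literally, then allow an optional "-cn"
# and require the "_<number>" suffix. Because the code must be followed
# directly by "-cn" or "_", names such as 17-0201M_0 are rejected.
$pattern = '^' + [regex]::Escape($code) + '(-cn)?_\d+$'

$found = $fileNames | Where-Object { $_ -match $pattern }
$found
```

    Against a real directory, the same filter could be applied to file names without their extension, for example `Get-ChildItem -Path $folder -File | Where-Object { $_.BaseName -match $pattern }`, where `$folder` is again a placeholder.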

    • #63322

      Ron
      Participant

      You'll need to define which character(s) may or may not follow the string for it to count as a match. For instance, are "-" and "_" the only characters that can follow the base pattern and still be considered a match? Or could it be any character that is not a letter (or a number)? What about white space? And so on.
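
    To make the question above concrete, here is a small sketch comparing two possible definitions of "what may follow the code". Both patterns are assumptions for illustration, not something the original poster specified: option A accepts anything except a letter or digit, while option B accepts only "-", "_", or the end of the string.

```powershell
$samples = '17-0201', '17-0201_0', '17-0201-cn_0', '17-0201M_0', '17-0201 spacehere'

# Option A: anything except a letter or digit may follow the base code.
$notAlnum = '^\d{2}-\d{4}(?![A-Za-z0-9])'

# Option B: only "-" or "_" (or the end of the string) may follow.
$dashOrUnderscore = '^\d{2}-\d{4}(?=[-_]|$)'

$rows = foreach ($s in $samples) {
    [pscustomobject]@{
        Value   = $s
        OptionA = $s -match $notAlnum
        OptionB = $s -match $dashOrUnderscore
    }
}
$rows
```

    The two options agree on every sample except "17-0201 spacehere", which option A accepts and option B rejects, so the choice depends on whether white space should count as a valid separator.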

  • #63334

    random commandline
    Participant
    $string = '17-0201_n
     17-0201_0
    17-0201-cn_0
    17-0201M_0
    17-0201M-cn_0
     17-0201
     17-0201 spacehere
     17-0201-BE
     17-0201_1' -split "`n"
    
    # Match 2 digits, a dash, and 4 digits at the end of the string, -OR-
    # the same code followed by a non-word character or underscore, plus anything after it
    foreach ($s in $string){
        [regex]::Matches($s,'\d{2}-\d{4}$|\d{2}-\d{4}[\W_].*').value
    }
    # Results:
    # 17-0201_n
    # 17-0201_0
    # 17-0201-cn_0
    # 17-0201
    # 17-0201 spacehere
    # 17-0201-BE
    # 17-0201_1
    
    
    
  • #63583

    Aaron Hardy
    Participant

    Thanks everyone for the input.

    Shortly after my last response, I did come up with the (simple) solution I needed but haven't been able to follow up until now. So here is the regex I came up with (for anyone else that might need it down the road):

    [string]$TimeCodeEmpty = "(\d{2}-\d{4}$)"
    [string]$TimeCodeM = "(\d{2}-\d{4}M$)"
    [string]$TimeCodeA = "(\d{2}-\d{4}A$)"
    [string]$TimeCodeE = "(\d{2}-\d{4}E$)"
    [string]$regex_TimeCode = "$TimeCodeEmpty|$TimeCodeM|$TimeCodeA|$TimeCodeE"

    Pretty simple, eh? Sometimes you just need to pull away and not overlook things.

    Thanks again!
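
    For anyone reusing the pattern above, here is a brief sketch of how it behaves with `-match`. The combined pattern is written out as a single literal so the block runs on its own. The `$compact` variant is an assumption of mine, not part of the original post: it folds the four alternatives into one optional character class. The sample values are made up for illustration.

```powershell
# The combined pattern from the post, written as one single-quoted literal.
$regex_TimeCode = '(\d{2}-\d{4}$)|(\d{2}-\d{4}M$)|(\d{2}-\d{4}A$)|(\d{2}-\d{4}E$)'

# Assumed shorter form: the code followed by at most one of M, A or E.
$compact = '\d{2}-\d{4}[MAE]?$'

# Illustrative values: four that should match, two that should not.
$samples = '17-0201', '19-1215M', '15-0923A', '17-0201E', '17-0201X', '17-0201-cn'

$results = foreach ($s in $samples) {
    [pscustomobject]@{
        Value    = $s
        Original = $s -match $regex_TimeCode
        Compact  = $s -match $compact
    }
}
$results
```

    Note that neither form is anchored at the start, so a longer value that merely ends in a valid code would also match; prefixing the pattern with `^` would tighten that if it matters. PowerShell's `-match` is also case-insensitive by default, so `-cmatch` may be preferable if lowercase suffixes should be rejected.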
