regex - get only uppercase


Viewing 3 reply threads
    • #183024

      I have a text file called old.txt with many records.

      old.txt looks like this:

      "R4E3W68P0"
      "g4y2W3ls0"

      I have web.config files that contain something like:

      clientCertificate ="R4E3W68P0" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"

      clientCertificate ="g4y2W3ls0" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"

       

      I'm trying to get only the thumbprints that are uppercase.

      The code is:

      $old = gc c:\old.txt

      foreach ($o in $old) {

      for ($i = 0 ; $i -lt $new.count ; $i++) {

      gc c:\web.config | Select-String $old[$i] |Where { $_ -cmatch "\b[A-Z0-9_]+\b" }

      }

      }

      but I get both lowercase and uppercase thumbprints.

      I also tried "\b[A-Z]+\b".

      If I run the same code against a web.config that contains only the uppercase thumbprint, it works.

    • #183045

      You don't need to double-up on loops, and your Select-String did not reference $o within the loop body.

      Try this:

      $old       = Get-Content -Path C:\old.txt
      $webConfig = Get-Content -Path C:\web.config
      
      foreach ($o in $old) {
          $webConfig |
              Select-String -Pattern $o |
              Where-Object -FilterScript { $_ -cmatch "\b[A-Z0-9_]+\b" }
      }

      And a kind reminder to use the Preformatted format when posting code.

    • #183132
      I think my problem is caused by the whitespace. I need to run gci with Select-String and a regex that checks whether the string is uppercase or lowercase, regardless of whitespace. F8 P3 W1 and F8P3W1 need to be treated the same, as an uppercase string, and likewise for lowercase.
    • #183858

      I can't reproduce your results exactly, but there are some problems with how your script is written. Here's the test script I ran:

      $old = @(
          "R4E3W68P0"
          "g4y2W3ls0"
          "a1b2c3d4e"
          "F8 P3 W1"
          "f9 P2 w4"
      )
      $webConfig = @(
          'clientCertificate ="R4E3W68P0" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"'
          'clientCertificate ="g4y2W3ls0" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"'
          'clientCertificate ="a1b2c3d4e" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"'
          'clientCertificate ="F8 P3 W1" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"'
          'clientCertificate ="f9 P2 w4" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"'
      )
      
      foreach ($o in $old) {
          for ($i = 0 ; $i -lt $webConfig.Count ; $i++) {
              $webConfig | Select-String $old[$i] | Where { $_ -cmatch "\b[A-Z0-9_]+\b" }
          }
      }

      The results look like this:

      clientCertificate ="R4E3W68P0" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="F8 P3 W1" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="f9 P2 w4" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="R4E3W68P0" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="F8 P3 W1" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="f9 P2 w4" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="R4E3W68P0" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="F8 P3 W1" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="f9 P2 w4" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="R4E3W68P0" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="F8 P3 W1" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="f9 P2 w4" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="R4E3W68P0" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="F8 P3 W1" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="f9 P2 w4" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"

      The case-sensitive match worked as written: it matches any line that contains a whole word made up only of uppercase letters, digits, and underscores.
      It matched R4E3W68P0 because this string contains only uppercase letters and digits between the word boundaries (as desired).
      It did not match g4y2W3ls0 because this string contains lowercase letters between the word boundaries (as desired).
      It did not match a1b2c3d4e because this string contains lowercase letters between the word boundaries (as desired).
      It matched F8 P3 W1 because this is actually three words that each satisfy the regex on their own. \b is a zero-width assertion that matches at the transition between a word character and a non-word character such as a space, so every space creates a word boundary. The result looks correct, but the regex is only testing one fragment rather than the whole thumbprint.
      It matched f9 P2 w4 because P2 satisfies the regex by itself. This looks like an undesired match of lowercase letters, but the regex is in fact matching only the P2 fragment.
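
      To see which fragment the regex actually latches onto, a minimal sketch like the following may help. It assumes the five sample thumbprints from the test script and reads the automatic $Matches variable, which -cmatch fills in when the left-hand side is a single string. The $samples and $report names are only illustrative.

      ```powershell
      # Sample thumbprints from the test script above (illustrative data only).
      $samples = 'R4E3W68P0', 'g4y2W3ls0', 'a1b2c3d4e', 'F8 P3 W1', 'f9 P2 w4'

      $report = foreach ($s in $samples) {
          # -cmatch is the case-sensitive form of -match; on a scalar it populates $Matches.
          if ($s -cmatch '\b[A-Z0-9_]+\b') {
              [pscustomobject]@{ Thumbprint = $s; Matched = $Matches[0] }
          }
          else {
              [pscustomobject]@{ Thumbprint = $s; Matched = $null }
          }
      }

      $report | Format-Table -AutoSize
      ```

      For F8 P3 W1 the Matched column holds only F8, and for f9 P2 w4 it holds only P2, which is why both lines pass the filter.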

      Also, because of the nested loops, it cycled through the input multiple times, producing duplicate results. If we simplify the code as Aaron recommends, it will cycle through the input only once.
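
      The duplication can be checked with a rough sketch that counts the results of both loop shapes. The config lines here are shortened stand-ins for the sample data, and the filter uses $_.Line explicitly rather than relying on the MatchInfo object being converted to a string; both choices are assumptions made for the illustration.

      ```powershell
      # Shortened stand-ins for the sample data above (illustrative only).
      $old       = 'R4E3W68P0', 'g4y2W3ls0', 'a1b2c3d4e', 'F8 P3 W1', 'f9 P2 w4'
      $webConfig = $old | ForEach-Object { 'clientCertificate ="{0}" storeName="My"' -f $_ }
      $regex     = '\b[A-Z0-9_]+\b'

      # Nested version: the inner loop repeats the full scan once per outer iteration.
      $nested = foreach ($o in $old) {
          for ($i = 0; $i -lt $old.Count; $i++) {
              $webConfig | Select-String -Pattern $old[$i] | Where-Object { $_.Line -cmatch $regex }
          }
      }

      # Single loop: each thumbprint is looked up exactly once.
      $single = foreach ($o in $old) {
          $webConfig | Select-String -Pattern $o | Where-Object { $_.Line -cmatch $regex }
      }

      'Nested loops: {0} results; single loop: {1} results' -f @($nested).Count, @($single).Count
      ```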

      We can fix the problems by modifying the script with Aaron's version, and adjusting the regex, like this:

      foreach ($o in $old) {
          $webConfig |
              Select-String -Pattern $o |
              Where-Object -FilterScript { $_ -cmatch '\"[A-Z0-9_\s]+\"' }
      }

      Using the same input as in the first test, this gives the following results:

      clientCertificate ="R4E3W68P0" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"
      clientCertificate ="F8 P3 W1" x509FindType="FindByThumbprint" storeLocation="LocalMachine" storeName="My"

      Instead of matching word boundaries with \b, we are now matching the quotation marks surrounding the string we want to check by using escaped double quotes (\"). We have also added \s to the character class to account for the spaces in the string.
      It matched "R4E3W68P0" because this string contains only uppercase alpha characters and digits between the double quotes (as desired).
      It did not match "g4y2W3ls0" because this string contains lowercase alpha characters between the double quotes (as desired).
      It did not match "a1b2c3d4e" because this string contains lowercase alpha characters between the double quotes (as desired).
      It matched "F8 P3 W1" because this string contains only uppercase alpha characters, digits and spaces between the double quotes (as desired).
      It did not match "f9 P2 w4" because this string contains lowercase alpha characters between the double quotes (as desired).

      Note that this depends on the string always being between double quotes in the input file. It also depends on the other strings that are between double quotes not having only uppercase alpha characters or anything else that matches the bracket set. If "LocalMachine" were instead written as "LOCALMACHINE", it would match (undesired).
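
      That caveat can be reproduced with a quick sketch. The line below is a made-up variant of the sample data: it keeps a lowercase thumbprint but writes storeLocation in uppercase, purely to illustrate the overmatch.

      ```powershell
      $pattern = '"[A-Z0-9_\s]+"'

      # Hypothetical line: lowercase thumbprint, uppercase storeLocation value.
      $line = 'clientCertificate ="g4y2W3ls0" x509FindType="FindByThumbprint" storeLocation="LOCALMACHINE" storeName="My"'

      $isMatch     = $line -cmatch $pattern
      $matchedText = if ($isMatch) { $Matches[0] } else { $null }

      "Matched: $isMatch ($matchedText)"
      ```

      The line passes the filter even though its thumbprint is lowercase, because the quoted LOCALMACHINE value satisfies the pattern.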

      If necessary, we can account for this by adding a lookbehind to the regex:

      foreach ($o in $old) {
          $webConfig |
              Select-String -Pattern $o |
              Where-Object -FilterScript { $_ -cmatch '(?<=clientCertificate =)\"[A-Z0-9_\s]+\"' }
      }

      This will result in only checking the string between double quotes that follows clientCertificate =. Everything else will not match, regardless of its contents.
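
      As a small check of the lookbehind, the sketch below runs it against three made-up lines in which storeLocation is deliberately uppercase. It relies on -cmatch returning the matching elements when the left-hand side is an array; the $lines and $upperOnly names are only illustrative.

      ```powershell
      # The lookbehind ties the check to the value that follows 'clientCertificate ='.
      $pattern = '(?<=clientCertificate =)"[A-Z0-9_\s]+"'

      # Hypothetical lines: storeLocation is uppercase in all of them.
      $lines = @(
          'clientCertificate ="R4E3W68P0" x509FindType="FindByThumbprint" storeLocation="LOCALMACHINE" storeName="My"'
          'clientCertificate ="g4y2W3ls0" x509FindType="FindByThumbprint" storeLocation="LOCALMACHINE" storeName="My"'
          'clientCertificate ="F8 P3 W1" x509FindType="FindByThumbprint" storeLocation="LOCALMACHINE" storeName="My"'
      )

      # With an array on the left, -cmatch returns only the elements that match.
      $upperOnly = $lines -cmatch $pattern
      $upperOnly
      ```

      Only the R4E3W68P0 and F8 P3 W1 lines come back; the uppercase LOCALMACHINE value no longer causes the g4y2W3ls0 line to match.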
