regex and checking if numbers are encapsulated

This topic contains 8 replies, has 3 voices, and was last updated by  iain Barnetson 3 days, 8 hours ago.

  • Author
    Posts
  • #84275

    iain Barnetson
    Participant

    I've a PS script that reads Helpdesk tickets and in those for correcting people SSNs, the BAs post the tsql query they want to be executed. But, some dont encapsulate the SSNs in quotations, so we get:

    UPDATE ssntable 
    SET ssn1 = 123456783 
    WHERE ssn1 = 876543214
    

    When we should be getting:

    UPDATE ssntable 
    SET ssn1 = '123456783'
    WHERE ssn1 = '876543214'
    

    On occasion we also get:

    SELECT firstname, lastname FROM table WHERE ssn in (123456786,876543217,321654784,345676873)
    

    When we should get

    SELECT firstname, lastname FROM table WHERE ssn in ('123456786','876543217','321654784','345676873')
    

    I've been trying to figure out a regex expression to check if the numerals are encapsulated in quotations and if not add them. I'm totally lost and hoping for some guidance and help.

  • #84286

    nohwnd
    Participant

    One way to approach this problem would matching every number, and it's surrounding quotes, and replacing them with the same number but quoted.

    1       -> '1'
    '1      -> '1'
    1'      -> '1'
    '12345' -> '12345'
    

    To do that you can use the -replace operator, where you specify the pattern to match, and if you use sub-matches you can put the sub matches back into the result by using $. In our case that is $1 to put back the first sub-match – the original number.

    Keep in mind that the $ in this case does NOT mean powershell variable, instead it is a token that the regex engine processes, so you must make sure that the $ remains in the text, that is why I am escaping it by a back-tick. Normally I would use single quotes but we use them in the replacement string so escaping by back-tick is easier.

    Since you mentioned you are lost, I assume you tried many things and nothing completely worked for you. So I suggest using Pester tests to slowly progress towards the solution. Below I created two tests, the first one passes and shows that the examples I mentioned at the start of this post in fact work. The last test does not pass because we are replacing any number with it's quotes version, and that means we end up with ssn'1' =, which is incorrect. It is up to you to slowly add more complex examples and modify the Add-Quoute function to pass all of them. Good luck!

    function Add-Quote ($Value) {
        
        foreach ($l in $Value) {
            $l -replace "'?(\d+)'?", "'`$1'"
        }
    }
    
    Describe "Put numbers in single quotes" {
        It "Given just numbers  it encapsulates them in single quotes" -TestCase @(
            @{Data="1"; Expected="'1'"}
            @{Data="'1"; Expected="'1'"}
            @{Data="1'"; Expected="'1'"}
            @{Data="12345"; Expected="'12345'"}
        ) {
            param($Data, $Expected)
            # -- Arrange
            
            # -- Act
            $actual = Add-Quote $Data
    
            # -- Assert
            $actual | Should -Be $Expected
        }
    
        # [ .. more tests would go here .. ] 
    
        It "Encapsulates numbers on multiple lines" {
            # -- Arrange  
            $data = "UPDATE ssntable`nSET ssn1 = 123456783`nWHERE ssn1 = 876543214"
            $expected = "UPDATE ssntable`nSET ssn1 = '123456783'`nWHERE ssn1 = '876543214'"
    
            # -- Act 
            $actual = Add-Quote $data
    
            # -- Assert
            $actual | Should -Be $expected
        }
    }
    
    • #84388

      iain Barnetson
      Participant

      Thanks Mr nohwnd, I'll give that a try out.
      I've read about Pester but never used it yet, will get that setup and try it out too.
      Thanks.

  • #84407

    Olaf Soyk
    Participant

    Actually there are regular expressions able to do what you ask for:

    $String = "SELECT firstname, lastname FROM table WHERE ssn in ('123456786,876543217','321654784',345676873)"
    $String -replace "(?< !'|\d)\d","'" -replace "\d(?!'|\d)","'"

    All numbers are surrounded by single quotes no matter if they had before a single quote before them after them or none at all.

    • #84415

      nohwnd
      Participant

      Now you spoiled the fun of figuring it out himself :))

    • #84427

      iain Barnetson
      Participant

      Thank you!

    • #84440

      iain Barnetson
      Participant

      Sadly that doesn't work as expected, it's trimming numbers off where it adds a '

      $String = "SELECT firstname, lastname FROM table WHERE ssn in ('123456786,876543217','321654784',345676873)"
      $String -replace "(?< !'|\d)\d","'" -replace "\d(?!'|\d)","'"
      

      results in this:

      SELECT firstname, lastname FROM table WHERE ssn in ('12345678','76543217','321654784','4567687')
      
  • #84512

    Olaf Soyk
    Participant

    Hmmm ... you're right ... I didn't notice that before ... I'm sorry ... obviously the -replace operator does not "like" the negative lookahead or lookbehind. So we need another approach. Try this:

    $String = "SELECT firstname, lastname FROM table WHERE ssn in ('123456786,876543217','321654784',345676873)"
    $String -replace "'(\d+)",'$1' -replace "(\d+)'",'$1' -replace "'(\d+)'",'$1' -replace '(\d+)','''$1'''
    • #84518

      iain Barnetson
      Participant

      Olaf, That's done it nicely.
      Much appreciated, I've been trying to read up on this stuff today and got no where close.

      PS C:\Users\svn_admin> $String = "SELECT firstname, lastname FROM table WHERE ssn in ('123456786,876543217','321654784',345676873)"
      $String -replace "'(\d+)",'$1' -replace "(\d+)'",'$1' -replace "'(\d+)'",'$1' -replace '(\d+)','''$1'''
      SELECT firstname, lastname FROM table WHERE ssn in ('123456786','876543217','321654784','345676873')
      

      ...and another try

      PS C:\Users\svn_admin> $String = "SELECT firstname, lastname FROM table WHERE ssn = 123456786"
      $String -replace "'(\d+)",'$1' -replace "(\d+)'",'$1' -replace "'(\d+)'",'$1' -replace '(\d+)','''$1'''
      SELECT firstname, lastname FROM table WHERE ssn = '123456786'
      

You must be logged in to reply to this topic.