More RegEx confusion with telephone numbers


This topic contains 2 replies, has 2 voices, and was last updated by Gerry McCafferty 2 years ago.

  • #20647
    Gerry McCafferty
    Participant

    Hi all,

    I have been asked by the business to convert the UK telephone numbers stored in AD from the E164 standard to the +44(0) XXX XXX XXX format (which a lot of people do not like; not my decision).

    As I do not know what the telephone numbers will be, but do know the pattern, I thought this would be a perfect fit for RegEx; however, I am again struggling to get it to work.

    The first issue is one of error checking.

    I have created these RegEx patterns to detect both formats:

    #Example Format required by business
    $a = "+44(0)1202 204 746"
    #Example in E164 format
    $b = "+441202204746"
    #RegEx for $a
    $a -match "^\+[44]*[(0)]*\d{4}\s*\d{3}\s*\d{3}$"
    #RegEx for $b
    $b -match "^\+[44][0-9]*$"
    

    Unfortunately, if I run the RegEx for $a against $b, it still returns true, so I cannot be sure that the number is in the correct format. How do I ensure the RegEx returns false if the number has no spaces and no (0)?

    The second issue is how to alter the number. I have tried the following two methods, but neither is working. What am I doing wrong?

    $a = "+441202204746"
    $a -replace "(^\+[44]*[(0)]*\d{4}\s*\d{3}\s*\d{3}$)", '$a'
    $a
    
    $a = "4412022047"
    $b = "{0:(###) ###-####}" -f $a
    $b
    

    Thanks in advance for any pointers you can give me 🙂

  • #20648
    Dave Wyatt
    Moderator

    In regex, square brackets create a character class: a set of characters, any one of which will be matched in a single position of the input string (unless you add a quantifier, such as + or * or {3}, which tells it to match the character class a different number of times). So [44] matches a single 4, not the literal text 44.
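    To make the character-class point concrete, here is a minimal sketch (the sample strings and variable names are chosen purely for illustration) showing that [44] behaves like [4], and why the first pattern from the question also accepts the E164 string:

```powershell
# [44] is a character class containing only the character '4',
# so it matches exactly ONE character.
$oneFour  = '+4'  -match '^\+[44]$'    # True: a single 4 fills the class
$twoFours = '+44' -match '^\+[44]$'    # False: nothing matches the second 4

# In the question's pattern, [44]* and [(0)]* both allow zero repetitions,
# and \s* allows zero spaces, so the "(0)" and the spaces are all optional.
$loosePattern = '^\+[44]*[(0)]*\d{4}\s*\d{3}\s*\d{3}$'
$e164Accepted = '+441202204746' -match $loosePattern    # True
```

    Escaping the parentheses and dropping the square brackets, as in the pattern below, makes +44 and (0) literal text rather than sets of allowed characters.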

    If you don't actually care what format the phone number was originally in, and just want to make sure it's matching the "+44(0)1202 204 746" format, then this should work with a single pattern:

    $numbers = @(
        '+44(0)1202 204 746'
        '+441203206748'
        '+4412345678901' # Deliberately has too many digits to test validation
    )
    
    $pattern = '^\+44(?:\(0\))?(\d{4})\s*(\d{3})\s*(\d{3})$'
    
    foreach ($number in $numbers)
    {
        if ($number -match $pattern)
        {
            '+44(0){0} {1} {2}' -f $matches[1], $matches[2], $matches[3]
        }
        else
        {
            Write-Error "Number '$number' is not a valid UK telephone number."
        }
    }
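
    The same capture groups can also be used with -replace, which is what the second attempt in the question was reaching for. A minimal sketch, assuming the input is already known to be a UK number in E164 form ($e164 and $formatted are illustrative names, not from the original posts):

```powershell
$e164 = '+441202204746'

# $1, $2 and $3 refer to the three parenthesised groups in the pattern.
# Single quotes stop PowerShell from expanding them as variables.
$formatted = $e164 -replace '^\+44(\d{4})(\d{3})(\d{3})$', '+44(0)$1 $2 $3'
$formatted    # +44(0)1202 204 746

# -replace returns a new string; the original variable is unchanged.
$e164         # +441202204746
```

    In the question's attempt, '$a' in the replacement string is not a group reference, so it is inserted as literal text, and the result is never assigned back to a variable.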
    
  • #20649
    Gerry McCafferty
    Participant

    Hi Dave,

    Thanks for your reply. To make things easier, we are just going to check that the telephone attribute currently in AD is in E164 format by using this RegEx (basically, it has a + and 12 digits with no spaces):

    $b -match "^\+?\d{12}$" 
    

    Then, before we write to a custom AD attribute, we will check whether the two values are an exact match rather than using RegEx.
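
    One thing to watch in the check above: \+? makes the leading + optional, so a bare 12-digit string also passes. A short sketch of the difference, assuming the + should be mandatory (the variable names and the stricter pattern are illustrative, not something from the thread):

```powershell
$withPlus    = '+441202204746'
$withoutPlus = '441202204746'

# Pattern as posted: the ? after \+ means "zero or one plus sign".
$lenientA = $withPlus    -match '^\+?\d{12}$'    # True
$lenientB = $withoutPlus -match '^\+?\d{12}$'    # True as well

# Dropping the ? requires the plus sign to be present.
$strictA = $withPlus    -match '^\+\d{12}$'      # True
$strictB = $withoutPlus -match '^\+\d{12}$'      # False
```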

    The part I am still confused about is how the formatting works.

    I have had a look at http://msdn.microsoft.com/en-us/library/System.String.Format(v=vs.110).aspx to better understand how this works, but I am still confused, and I cannot get your example to work when I adapt it to another case.

    If I understand correctly, in the line

    '+44(0){0} {1} {2}' -f $matches[1], $matches[2], $matches[3] 
    

    you are telling the formatting command to put into matches[1],[2] and [3] the relevant parts of the number, but I am struggling to see how. Is this because of the RegEx in $pattern?

    Thanks again for all of your help!
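
    For readers with the same question, a minimal sketch of how the two pieces fit together: -match fills the automatic $Matches variable with whatever each parenthesised group in $pattern captured, and -f only reads those values and drops them into the {0}, {1} and {2} placeholders. The sample number and the name $formatted are illustrative.

```powershell
$pattern = '^\+44(?:\(0\))?(\d{4})\s*(\d{3})\s*(\d{3})$'
$number  = '+441202204746'

if ($number -match $pattern)
{
    # -match populated $Matches: index 0 is the whole match,
    # 1..3 are the capture groups; (?: ... ) is non-capturing and gets no index.
    $Matches[1]    # 1202
    $Matches[2]    # 204
    $Matches[3]    # 746

    # -f substitutes its right-hand values into the placeholders, in order.
    $formatted = '+44(0){0} {1} {2}' -f $Matches[1], $Matches[2], $Matches[3]
    $formatted     # +44(0)1202 204 746
}
```

    So the regex does the splitting and the format string does the reassembly; the -f operator never writes anything into $Matches.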
