Text blocks and long regex strings

This topic contains 1 reply, has 2 voices, and was last updated by Dave Wyatt 1 year, 7 months ago.

  • #37779

    marksrasmussen
    Participant

    I want this regex to be formatted over several lines for clarity, like this (the forum software strips all of the assignment tags, or else marks the whole post as spam):

    $regexDatum = [regex] @"
    ^\< (?[0-9A-Fa-f][0-9A-Fa-f]),
    (?\w\w),
    (?\d+),
    (?\d+),
    (?\d+),
    (?-{0,1}\d+),
    (?\d+),
    (?\d+),
    (?-{0,1}\d+),
    (?\d+),
    (?\d+),
    (?\d+),
    "@

    But I find that the whitespace at the end of each line causes problems, and the backtick seems to compound the problem. Hence I'm forced to put it all on one line – hello column 240 ...

    Most C compilers would allow me to do something like this (backslashes not escaped for the C compiler):

    char regexDatum[] = "^\< (?[0-9A-Fa-f][0-9A-Fa-f]),"
                        "(?\w\w),"
                        "(?\d+),"
                        "(?\d+),"
                        "(?\d+),"
                        "(?\d+),"
                        "(?-{0,1}\d+),"
                        "(?\d+),"
                        "(?\d+),"
                        "(?-{0,1}\d+),"
                        "(?\d+),"
                        "(?\d+),"
                        "(?\d+),";
    

    Is there a way to terminate/continue the line within a text block?

    Bonus question:
    Are comments allowed within the regex pattern string?
    (e.g. Perl allows (?# my comments) or the /x modifier)

  • #37803

    Dave Wyatt
    Moderator

    Unless you explicitly say otherwise, whitespace (including newlines) becomes part of your regex pattern. The simplest way to tell the regex engine to ignore pattern whitespace is to begin the pattern with (?x):

    [regex] @"
    (?x)
    ^([0-9A-Fa-f][0-9A-Fa-f]),
    (\w\w),
    (\d+),
    (\d+)
    "@
    
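    To see the difference (?x) makes, here is a minimal sketch that runs the same four-group pattern with and without it. The sample line and the variable names $line, $regexDatum and $regexLiteral are made up for illustration. It also swaps in a single-quoted here-string (@' ... '@) instead of the double-quoted one, on the assumption that you do not want PowerShell expanding $ or backticks inside the pattern. Note that the closing '@ has to sit at the very start of its line.

```powershell
# Sample input line; the values are invented for illustration.
$line = '3F,ab,12,345'

# With (?x): newlines and indentation inside the pattern are ignored.
$regexDatum = [regex] @'
(?x)
^([0-9A-Fa-f][0-9A-Fa-f]),
(\w\w),
(\d+),
(\d+)
'@

# Without (?x): every newline in the here-string is a literal part of the pattern.
$regexLiteral = [regex] @'
^([0-9A-Fa-f][0-9A-Fa-f]),
(\w\w),
(\d+),
(\d+)
'@

$match = $regexDatum.Match($line)
$match.Success                  # True
$match.Groups[1].Value          # 3F
$regexLiteral.IsMatch($line)    # False: the input contains no newlines
```

    The second pattern only fails because the input has no line breaks after each comma, which is usually what goes wrong when a multi-line pattern "stops matching".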

    And yes, .NET also supports regex comments. In conjunction with the "ignore pattern whitespace" option, you can simply place an unescaped # sign into the pattern, and anything between it and the end of the line becomes a comment (just like in PowerShell code, in fact).

    [regex] @"
    (?x)                            # Ignore Pattern Whitespace
    ^([0-9A-Fa-f][0-9A-Fa-f]),      # Couple of hex digits
    (\w\w),                         # Couple of word characters
    (\d+),                          # Some digits
    (\d+)                           # Some digits
    "@
    
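    As a rough sketch of how the commented form might look with named groups: the group names (hex, code, count, offset) and the sample line below are hypothetical, since the forum stripped the original names from the question. One thing to watch for is that under (?x) a bare space is ignored too, so a literal space has to be written as [ ] or escaped. The (?# ... ) inline comment form from the bonus question is also accepted by .NET, with or without (?x).

```powershell
# Hypothetical input; the leading "< " mirrors the shape of the pattern in the question.
$line = '< 3F,ab,12,-345'

$regexDatum = [regex] @'
(?x)                        # Ignore pattern whitespace
^<[ ]                       # Literal "<" and one space (a bare space would be ignored)
(?<hex>[0-9A-Fa-f]{2}),     # Two hex digits
(?<code>\w\w),              # Two word characters
(?<count>\d+),              # Some digits
(?<offset>-?\d+)            # Some digits, optionally negative
$
'@

$m = $regexDatum.Match($line)
$m.Groups['hex'].Value      # 3F
$m.Groups['offset'].Value   # -345

# Inline comment form; this one does not need (?x).
$inline = [regex] '^(\d+)(?# first number),(\d+)$'
$inline.IsMatch('12,34')    # True
```

    Keep closing parentheses out of the text of a (?# ... ) comment, because the first ) ends it.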
    
