regex

  • #145814

    Participant
    Points: 72
    Rank: Member

    hi all,

    I'm new to regex and I'm trying a simple one to see how they work.

    My simple regex is "(\w+\@\w+\w+)", which I'm using to capture email addresses from a large block of text.

    I've created a text file, put into it a big chunk of random text, and inserted an email at 2 spots in there.

    When I compare the file's content to my regex, -match shows me the 2 lines with the email but what I see is the whole line, not the email I was looking for.

    If I do $matches, I see no match.

    Can someone please explain what I am doing wrong?

    $file = get-content file.txt

    $regex = "(\w+\@\w+\w+)"

    $file -match $regex

    Thank you!

  • #145823
    js

    Participant
    Points: 956
    Helping Hand
    Rank: Major Contributor

    I think you want to go through the file one line at a time. -match works differently with an array: it returns the elements that match instead of True or False, and it doesn't set $matches.

    foreach ($line in $file) {
      $line -match $regex
      $matches
    }
    
    True
    
    Name                           Value
    ----                           -----
    1                              js@powershell
    0                              js@powershell
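
    To make the difference concrete, here is a minimal sketch. The $lines array is made-up sample data standing in for the output of Get-Content, and $filtered and $found are just illustrative names. Testing the result of -match before reading $matches also avoids printing a stale match left over from an earlier line.

    ```powershell
    # Hypothetical sample data standing in for the output of Get-Content file.txt
    $lines = 'no address here', 'write to js@powershell today', 'more filler text'
    $regex = '(\w+\@\w+\w+)'

    # Array on the left: -match acts as a filter and returns the matching lines.
    $filtered = $lines -match $regex

    # Single string on the left: -match returns a Boolean and populates $matches.
    $found = foreach ($line in $lines) {
        if ($line -match $regex) {
            $matches[1]   # text captured by the first group
        }
    }
    $found
    ```

    Here $filtered holds the whole matching line, while $found holds only the captured text.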
    
  • #145844

    Participant
    Points: 328
    Helping Hand
    Rank: Contributor

    I would recommend using the [regex] type accelerator. It lets you tap directly into the .NET Regex class, and it's faster and more efficient than using native PS to loop through the data.

    $text = Get-Content -Path C:\TEMP\PSdotorg_deleteMe.txt
    $regex = "(\w+\@\w+\w+)"
    
    [regex]::Matches($text,$regex)
    
    Groups   : {0, 1}
    Success  : True
    Name     : 0
    Captures : {0}
    Index    : 0
    Length   : 10
    Value    : this@email
    
    Groups   : {0, 1}
    Success  : True
    Name     : 0
    Captures : {0}
    Index    : 133
    Length   : 13
    Value    : another@email
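
    One detail worth knowing: Get-Content returns an array of lines, and as far as I know PowerShell joins that array with spaces when it is passed to a parameter that expects a single string, which is why the Index values above count into the joined text. A small sketch, assuming made-up sample lines in $text, that joins explicitly and keeps only the matched text:

    ```powershell
    # Hypothetical lines standing in for Get-Content output (an array of strings)
    $text = 'this@email is on line one', 'line two has another@email in it'
    $regex = '(\w+\@\w+\w+)'

    # Join explicitly so the Index values refer to a known string.
    # (Get-Content -Raw would read the file as a single string instead.)
    $joined = $text -join "`n"

    # .Value returns just the matched text instead of the full Match objects.
    $addresses = [regex]::Matches($joined, $regex).Value
    $addresses
    ```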
    
  • #145847

    Participant
    Points: 328
    Helping Hand
    Rank: Contributor

    Another recommendation is to use RegEx101.com to double-check your syntax. I personally find it easier to test there than to run through PS until I get it right, but to each their own.

    $regex = "(\w+@\w+\.\w+)"
    

    That regex will return the entire email address, including the .xyz.

    • #145856
      js

      Participant
      Points: 956
      Helping Hand
      Rank: Major Contributor

      Even that pattern won't work on email addresses with more than one '.' in the domain.

      PS C:\users\js> 'js@www.powershell.org' -match "(\w+@\w+\.\w+)"; $matches
      True
      
      Name                           Value
      ----                           -----
      1                              js@www.powershell
      0                              js@www.powershell
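
      One possible way to allow more than one '.', sketched here with illustrative sample strings, is to repeat the ".label" part of the domain. This is still not a full email validator: the part before the @ is limited to letters, digits, and underscores.

      ```powershell
      # \w+         local part (letters, digits, underscore only)
      # @\w+        first domain label
      # (?:\.\w+)+  one or more further ".label" parts
      $regex = '\w+@\w+(?:\.\w+)+'

      $samples = 'js@www.powershell.org', 'js@powershell.org'
      $results = foreach ($s in $samples) {
          if ($s -match $regex) { $matches[0] }
      }
      $results
      ```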
      
    • #145878

      Participant
      Points: 328
      Helping Hand
      Rank: Contributor

      You are correct, unconventional formatting in the domain hadn't occurred to me! The expression below accounts for that, provided of course that the part before the @ contains only word characters (letters, digits, or underscores) and the address is followed by whitespace. 🙂

      $regex = '\w+@.+?(?=\s)'
      
      [regex]::Matches($text,$regex)
      
      Groups   : {0}
      Success  : True
      Name     : 0
      Captures : {0}
      Index    : 0
      Length   : 17
      Value    : email@address.com
      
      Groups   : {0}
      Success  : True
      Name     : 0
      Captures : {0}
      Index    : 150
      Length   : 27
      Value    : another@email.different.com
      
      Groups   : {0}
      Success  : True
      Name     : 0
      Captures : {0}
      Index    : 222
      Length   : 47
      Value    : whatTheHeck@www.why.onearth.areyoudoingthis.com
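
      A caveat with the lookahead: (?=\s) needs a whitespace character after the address, so an address at the very end of the text is skipped. The sketch below uses a made-up sample string and compares it with \w+@\S+, one possible alternative that simply takes every non-whitespace character after the @ (and will therefore also pick up trailing punctuation).

      ```powershell
      # Hypothetical sample: the second address sits at the very end of the text
      $text = 'first email@address.com then last@example.org'

      # Lookahead version: requires whitespace after the match
      $lookahead = @([regex]::Matches($text, '\w+@.+?(?=\s)').Value)

      # Alternative: consume non-whitespace characters, no lookahead needed
      $nonSpace = @([regex]::Matches($text, '\w+@\S+').Value)

      "Lookahead found $($lookahead.Count), \S+ found $($nonSpace.Count)"
      ```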
      
  • #145886

    Participant
    Points: 1,368
    Helping Hand
    Rank: Community Hero

    An email address might be the wrong choice for a regex beginner ... even though most people think an email address is easy to recognize.

    This site illustrates a little bit of what I mean: http://emailregex.com/.

  • #146022

    Participant
    Points: 72
    Rank: Member

    OK, I will experiment with what you guys gave me and I believe I'll be good.

    Thank you!

  • #146247

    Participant
    Points: 1,316
    Helping Hand
    Rank: Community Hero

    Messing with text that contains URL strings instead of email strings.

    $UrlList = @'
    this is the URL https://stackoverflow.com/&20%
    http://stackoverflow.com
    http://www.SomeSite.com this is our main site
    http://www.SomeSite.com
    ftp://www.somesite.com
    ftp://somesite.com
    ftp\SomeSite.com
    If you want the file go there: file://SomeSite.com
    '@ 
    
    [RegEx]::Matches($UrlList, '(ftp:|ftp|http:|https:|file:)(//.([^\s]+)|\\.([^\s]+))').value
    
    https://stackoverflow.com/&20%
    http://stackoverflow.com
    http://www.SomeSite.com
    http://www.SomeSite.com
    ftp://www.somesite.com
    ftp://somesite.com
    ftp\SomeSite.com
    file://SomeSite.com
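
    Because the pattern contains capture groups, each match can also be broken apart. A minimal sketch, assuming a made-up one-line sample in $text and illustrative property names (Scheme, Url), that reads the first group alongside the full match:

    ```powershell
    # Hypothetical sample text containing two of the URL styles from the list above
    $text = 'main site http://www.SomeSite.com and mirror ftp\SomeSite.com'
    $pattern = '(ftp:|ftp|http:|https:|file:)(//.([^\s]+)|\\.([^\s]+))'

    $urls = foreach ($m in [regex]::Matches($text, $pattern)) {
        [pscustomobject]@{
            Scheme = $m.Groups[1].Value   # first group, e.g. "http:" or "ftp"
            Url    = $m.Value             # the full match
        }
    }
    $urls
    ```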
    
    
