POWERSHELL HTML FILE MANIPULATION - REGEX

This topic contains 4 replies, has 2 voices, and was last updated by Amar Helloween 3 months, 1 week ago.

  • #51474
    Amar Helloween
    Participant

    Hi All,

    Below is part of an HTML file.

    ******************************************************************************
    To: amar.helloween@email.com
    Subject: AG Sanity Status – Morning Sanity Check On AG Node 1: 08/22/16 06:25
    From: Amarnath.Mahato@cgi.com>
    Reply-to: Amarnath.mahato@cgi.com
    Content-Type: text/html; charset=us-ascii

    <html>
    ....some content
    </html>

    ******************************************************************************

    My task is to remove everything from "To: amar.helloween..." through "charset=us-ascii" and keep only the <html>....</html> content.

    Kindly provide a regex that removes the header lines above.

    I tried the following, but it is not working. The date in the subject also changes daily, so the pattern cannot hard-code it. Please help me with this.

    $regex1 = 'To: amar.helloween@email.com
    Subject: AG Sanity Status – Morning Sanity Check On AG Node 1: 08/22/16 06:25
    From: Amarnath.Mahato@cgi.com>
    Reply-to: Amarnath.mahato@cgi.com
    Content-Type: text/html; charset=us-ascii'

    $new_html = @()
    gc 'D:\Report.html' -raw |
    foreach {
    if ($_ -match $regex1)
    {
    $new_html += ($_ -replace $regex1,"")
    $new_html | Out-File "D:\Report1.html"
    }
    else
    { "did not match"}
    }
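
    The literal pattern above hard-codes both the date and the exact line breaks, and either may prevent a match. A minimal sketch of a more tolerant pattern follows, assuming the header block always begins with a "To:" line and ends with the "Content-Type:" line. The names $raw and $headerPattern and the abbreviated sample text are illustrative only, not taken from the real file.

```powershell
# Hypothetical sample standing in for the text of D:\Report.html (abbreviated).
$raw = @(
    'To: amar.helloween@email.com'
    'Subject: AG Sanity Status - Morning Sanity Check On AG Node 1: 08/22/16 06:25'
    'From: Amarnath.Mahato@cgi.com>'
    'Reply-to: Amarnath.mahato@cgi.com'
    'Content-Type: text/html; charset=us-ascii'
    ''
    '<html>'
    '....some content'
    '</html>'
) -join "`r`n"

# (?s) lets . cross line breaks, (?m) lets ^ match at each line start,
# (?i) ignores case. The lazy .*? stops at the first Content-Type line,
# so the date in the Subject line can be anything.
[regex]$headerPattern = '(?smi)^To:.*?^Content-Type:[^\r\n]*(\r?\n)+'

# Replace only the first match so the rest of the text is left untouched.
$body = $headerPattern.Replace($raw, '', 1)
$body
```

    Against the real file, $raw would come from Get-Content 'D:\Report.html' -Raw, and $body could be piped to Out-File as in the attempt above.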

  • #51480
    Amar Helloween
    Participant

    Here is the link :

  • #51490
    Rob Simmers
    Participant

    It appears you want to match what is between the HTML tags rather than exclude what you don't want:

    $test = @"
    blah
    blah
    blah

    <html>
        <head>
            Some HTML content
        </head>
        <body>
            blah blah blah
        </body>
    </html>

    "@

    # http://stackoverflow.com/questions/7167279/regex-select-all-text-between-tags
    $pattern = "<html>(.|\n)*?</html>"
    [regex]::matches($test,$pattern).Value


    Output:

    <html>
        <head>
            Some HTML content
        </head>
        <body>
            blah blah blah
        </body>
    </html>

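
    The same idea of matching what sits between the html tags can be applied to the report file itself. The following is only a sketch: Get-HtmlBlock is a hypothetical helper name, the sample string is made up, and the file paths in the commented usage are the ones from the original question. It uses the (?s) option so that . matches line breaks, as an alternative to (.|\n).

```powershell
function Get-HtmlBlock {
    param([string]$Text)
    # (?i) ignores tag case; (?s) lets . match line breaks.
    # \b allows attributes on the opening tag, e.g. <html lang="en">.
    $m = [regex]::Match($Text, '(?is)<html\b.*?</html>')
    if ($m.Success) { $m.Value } else { $null }
}

# Usage against the file from the question:
# $html = Get-HtmlBlock (Get-Content 'D:\Report.html' -Raw)
# if ($html) { $html | Out-File 'D:\Report1.html' } else { 'did not match' }

# Made-up sample: mail-style header lines followed by an HTML document.
$sample = "To: someone@example.com`r`nContent-Type: text/html`r`n`r`n<html>`r`n<body>hi</body>`r`n</html>`r`n"
Get-HtmlBlock $sample
```

    Because the match is lazy, it stops at the first closing html tag. Nothing here depends on the header lines, so a changing date in the subject does not matter.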
  • #51499
    Rob Simmers
    Participant
