Finding hyperlinks in Word documents, then attaching the linked document

This topic contains 1 reply, has 2 voices, and was last updated by Rob Simmers 2 years, 4 months ago.

  • #17282
    Aled Hughes
    Participant

    Hey, I'm creating a PowerShell script to search through Word documents for hyperlinks. Here is what I would like to achieve:

    Search through the Word document

    Find the hyperlinks

    Obtain each hyperlink address

    Find the other Word document that the hyperlink address points to

    Then attach that document to the original document

    Here is what I have so far:

    $wdStory = 6
    $wdMove = 0
    $application = New-Object -ComObject word.application
    $root='S:\IT\Applications\Livelink\OIDocuments\Aled'
    $subfolder='\Docs'
    $path = "$root$subfolder"
    $ifile="$($root)\Part_6_Load_runindex.txt"
    $file="$($root)\debug_Part_6_Load.txt"

    Get-ChildItem $path -Recurse -include *.doc, *.docx |
        ForEach-Object{
            $document=$application.documents.open($_.fullname)
            write-host "Processing $($document.name)" -ForegroundColor green
            $objSelection = $application.Selection
            $a = $objSelection.EndKey($wdStory, $wdMove)
            $docprops=@{
                Name=$_.Name
                Path=$_.DirectoryName
                LinkText=$null
                LinkAddress=$null
            }
            $document.Hyperlinks |
                ForEach-Object{
                    $docprops.LinkText=$_.TextToDisplay
                    $docprops.LinkAddress=$_.Address
                    New-Object PSObject -Property $docprops
                    $fpath = $document.path;
                    $newaddress= $fpath.SubString(0,($fpath.LastIndexOf('\')+1))
                    $faddress = $_.Address
                    add-content $file "newaddress: $($newaddress)";
                    if($faddress.startswith("..\"))
                    {
                        $newaddress = $newaddress+$faddress.replace("..\","")
                        write-host $newaddress;
                        add-content $file "newaddress (..\): $($newaddress)";
                        write-host "another end";
                    }
                    elseif($faddress.startswith("\\"))
                    {
                        $newaddress = $faddress;
                        write-host $newaddress;
                        add-content $file "newaddress (\\): $($newaddress)";
                        write-host "another end";
                    }
                    else
                    {
                        #$newaddress = +$root;
                        #$faddress.replace("../","/")
                        write-host $root
                        $faddress = $faddress.replace("../","\")
                        $newaddress = $newaddress + $faddress
                        write-host $newaddress
                    }
                    #$objselection.InsertBreak(1);
                    #$insertdoc = $application.documents.open($newaddress)
                }

            $document.Close()
            [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($document)
        } | Format-List

    [void]$application.quit()
    [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($application)

    I cannot figure out how to format the hyperlink address so that the insertdoc step works. There are also some strange things going on with the loops. Overall I'm confused, as I have never used PowerShell before. Is there any help out there?
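
    For the address formatting, one option is to let .NET do the path arithmetic instead of string replacement. This is a minimal sketch, not a tested solution: Resolve-LinkPath is a hypothetical helper name, and it assumes Windows paths, that relative hyperlink addresses are relative to the folder of the document containing them, and that $document.Path returns that folder without a trailing backslash.

    ```powershell
    # Hypothetical helper: turn a hyperlink address into a full Windows file path.
    function Resolve-LinkPath {
        param(
            [string] $DocumentFolder,   # folder of the document that holds the link
            [string] $Address           # the hyperlink's Address value
        )

        # Links that only point inside the same document have no address.
        if ([string]::IsNullOrEmpty($Address)) { return $null }

        # Skip web-style links such as http://..., which are not files on disk.
        if ($Address -match '^[a-z][a-z0-9+.-]*://') { return $null }

        # Addresses may use forward slashes; normalize them for Windows.
        $normalized = $Address.Replace('/', '\')

        # UNC paths (\\server\share\...) and drive paths (C:\...) are already complete.
        if ([System.IO.Path]::IsPathRooted($normalized)) { return $normalized }

        # GetFullPath collapses any number of "..\" segments.
        $combined = [System.IO.Path]::Combine($DocumentFolder, $normalized)
        return [System.IO.Path]::GetFullPath($combined)
    }

    # Possible use inside the hyperlink loop of the script above (untested sketch):
    #   $target = Resolve-LinkPath -DocumentFolder $document.Path -Address $_.Address
    #   if ($target -and (Test-Path -LiteralPath $target)) {
    #       [void]$application.Selection.EndKey($wdStory, $wdMove)
    #       $application.Selection.InsertFile($target)
    #   }
    ```

    Inserting a file while still enumerating $document.Hyperlinks may add new hyperlinks to the collection being looped over, so it is probably safer to collect the resolved paths first and insert them in a second pass.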

  • #17289
    Rob Simmers
    Participant

    Working with Office products through COM is an adventure. The good news is that the macro recorder can give you a hint of what to do. Enable the Developer tab on the ribbon, start recording a macro, perform the task manually in the document, and stop the recording. Then look at the Visual Basic for Applications (VBA) code the macro produced and convert it to PowerShell. I was actually just documenting this for something else, so here is the example I did, generating a table. Here is the VBA:

    ActiveDocument.Tables.Add Range:=Selection.Range, NumRows:=3, NumColumns:= _
            2, DefaultTableBehavior:=wdWord9TableBehavior, AutoFitBehavior:= _
            wdAutoFitFixed
        With Selection.Tables(1)
            If .Style <> "Table Grid" Then
                .Style = "Table Grid"
            End If
            .ApplyStyleHeadingRows = True
            .ApplyStyleLastRow = False
            .ApplyStyleFirstColumn = True
            .ApplyStyleLastColumn = False
            .ApplyStyleRowBands = True
            .ApplyStyleColumnBands = False
        End With
        Selection.TypeText Text:="Write Something In Table"
    

    and here is the PowerShell conversion:

    # VBA's built-in constants do not exist in PowerShell, so define them first
    $wdWord9TableBehavior = 1
    $wdAutoFitFixed = 0
     
    $Word = New-Object -ComObject Word.Application
    $Word.Visible = $true
    $document = $Word.Documents.Add()
    # Same arguments as the VBA call: range, rows, columns, table behavior, autofit behavior
    $table = $Word.ActiveDocument.Tables.Add($Word.Selection.Range, 3, 2, $wdWord9TableBehavior, $wdAutoFitFixed)
    $table.ApplyStyleHeadingRows = $True
    $table.ApplyStyleLastRow = $False
    $table.ApplyStyleFirstColumn = $True
    $table.ApplyStyleLastColumn = $False
    $table.ApplyStyleRowBands = $True
    $table.ApplyStyleColumnBands = $False
    # Write into the first cell, where the selection sits after the table is added
    $table.Cell(1,1).Range.Text = "Write Something In Table"

    The VBA will at least give you something to search for, such as ".ActiveDocument.Tables.Add Powershell", to find similar scripts. Note that VBA has built-in constants like wdWord9TableBehavior, but in PowerShell you have to define them yourself. When I was doing VBScript, the declaration used to be Const wdWord9TableBehavior = 1, so a quick Google search for Const wdWord9TableBehavior usually turns up the actual value. Hopefully that gives you a good starting point for your research.
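
    As an alternative to searching for each value, the constants can sometimes be read from the Word interop enums. This is only a sketch under an assumption: it needs the Office primary interop assembly (Microsoft.Office.Interop.Word) to be installed and loadable, which is typically the case on a Windows PowerShell machine with Office but is not guaranteed.

    ```powershell
    # Assumption: the Word primary interop assembly is installed on this machine.
    Add-Type -AssemblyName Microsoft.Office.Interop.Word

    # Cast the enum members to [int] to get the numeric values the COM calls expect.
    $wdWord9TableBehavior = [int][Microsoft.Office.Interop.Word.WdDefaultTableBehavior]::wdWord9TableBehavior
    $wdAutoFitFixed       = [int][Microsoft.Office.Interop.Word.WdAutoFitBehavior]::wdAutoFitFixed

    # Show what was found so it can be compared with the values defined by hand.
    "wdWord9TableBehavior = $wdWord9TableBehavior"
    "wdAutoFitFixed       = $wdAutoFitFixed"
    ```

    If Add-Type cannot find the assembly, defining the values by hand as in the conversion above still works.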
