Batch Processing a pile of Word Documents

This topic contains 2 replies, has 3 voices, and was last updated by Ron (Participant) 1 year, 8 months ago.

  • #63466

    Participant
    Points: 0
    Rank: Member

    Hey Folks,

    I've been looking into a problem for a friend of mine. He has a group of people filling out a form (Word, .doc/.docx), and the completed forms are collected in a folder. We need to automate processing hundreds of documents, grab specific parameters from each one, and write them into a ".csv" file.

    So, I can read the files and stack them, but fishing out the specified string is where it starts to get tough.

    Can anybody help out here?

    
    # Folder containing the .docx forms, passed as the first script argument
    $docPath = $args[0]
    Write-Host "Processing documents from:" $docPath
    $all_docs = Get-ChildItem $docPath -Filter "*.docx"
    
    $word = New-Object -ComObject "Word.Application"
    $word.Visible = $false
    
    # Now, open each document and list the "ContentControls"
    $all_items = @()
    foreach ($file in $all_docs)
    {
      Write-Host "Processing:" $file.FullName
      # Keep the file object and the COM document in separate variables
      $doc = $word.Documents.Open($file.FullName)
    
      # Count is already a number, so print it directly
      $controlCount = $doc.ContentControls.Count
      Write-Host "Found:" $controlCount "content controls"
    
      # Now, we build one custom object per document: one property per titled control
      $item = New-Object PSObject
      foreach ($control in $doc.ContentControls)
      {
        # Skip untitled controls; -Force overwrites the value if a title occurs twice
        if ($control.Title)
        {
          $item | Add-Member -MemberType NoteProperty -Name $control.Title -Value $control.Range.Text -Force
        }
      }
      $item
      $all_items += $item
      $doc.Close()
    }
    
    # Quit Word so no hidden WINWORD.EXE process is left running
    $word.Quit()
    
    # Last, we save the collection to a CSV file
    $all_items | Export-Csv "exportData_DATE.CSV" -NoTypeInformation
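
One thing that may bite with a script like the one above: `Export-Csv` takes its column headers from the first object in the pipeline, so if some forms have content controls that others lack, values can silently drop out of the CSV. A minimal sketch of one way around that, using made-up sample data in place of the real `$all_items` (the property names `Name`, `City`, and `Phone` are hypothetical):

```powershell
# Hypothetical sample data standing in for $all_items; the property names are made up
$all_items = @(
  [pscustomobject]@{ Name = 'Alice'; City = 'Berlin' },
  [pscustomobject]@{ Name = 'Bob'; Phone = '555-0100' }
)

# Collect every property name that occurs on any object, in first-seen order
$columns = @($all_items | ForEach-Object { $_.PSObject.Properties.Name } | Select-Object -Unique)

# Re-project each object onto the full column list; missing properties come out empty
$normalized = $all_items | Select-Object -Property $columns

# ConvertTo-Csv is used here only to show the result; Export-Csv takes the same input
$csv = $normalized | ConvertTo-Csv -NoTypeInformation
$csv
```

With the real data, `$normalized | Export-Csv "exportData_DATE.CSV" -NoTypeInformation` would replace the last two lines.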
    
    
  • #63501

    Keymaster
    Points: 1,524
    Helping Hand, Team Member
    Rank: Community Hero

    Yeah, Word is about the worst-case scenario for this, unfortunately, and you're going to be stuck with a decade-old COM object to work with. And this isn't _really_ PowerShell; it's COM programming against Word. I say this only because, historically, we've had very low turnout on this type of question, and I didn't want you hanging around waiting for an answer that might not be forthcoming. Sorry :(.

    But, if you figure it out, maybe you'll consider dropping by now and again to answer these kinds of questions when they come up :).

  • #63507
    Ron

    Participant
    Points: 0
    Rank: Member

    It looks like he may have already figured out the COM part, but doesn't know how to get the data being returned in the format he wants. It might help to fill out a form with test data, show us the output, and then show us what you want it to be. This might actually be more of a Regex question.
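
If it does turn out to be a regex question, the extraction step might look roughly like the sketch below. Everything about the input is an assumption, since the thread never shows the actual form output: the sample text, the "Order number:" label, and the variable names are all hypothetical. The one thing taken from Word itself is that `Range.Text` separates paragraphs with a carriage return.

```powershell
# Hypothetical text as it might come back from $control.Range.Text
# (Word separates paragraphs with a carriage return, written `r in PowerShell)
$text = "Customer: Jane Doe`rOrder number: A-1042`rDate: 2017-03-14"

# Hypothetical pattern: capture whatever follows "Order number:" up to the end of the line
$pattern = 'Order number:\s*(?<order>[^\r\n]+)'

if ($text -match $pattern) {
  # $Matches is filled in by -match; the named group holds the captured value
  $order = $Matches['order'].Trim()
} else {
  $order = $null
}
$order
```

The named group keeps the lookup readable when several fields are pulled from the same text, and the `else` branch leaves `$null` for forms where the label is missing rather than reusing a stale `$Matches` from an earlier document.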

The topic ‘Batch Processing a pile of Word Documents’ is closed to new replies.