Batch Processing a Pile of Word Documents

This topic contains 2 replies, has 3 voices, and was last updated by Ron 3 months, 3 weeks ago.

  • #63466
    mike namib
    Participant

    Hey Folks,

    I've been looking into a problem for a friend of mine. He has a group of people filling out a form (Word, .doc/.docx), and the completed files are in a folder. We need to automate processing hundreds of documents, grab specific parameters from each, and write them into a ".csv" file.

    So, I can read the files and stack them, but fishing out the specified strings is where it starts to get tough.

    Can anybody help out here?

    
    $docPath = $args[0]
    Write-Host "Processing documents from:" $docPath
    $all_docs = Get-ChildItem $docPath -Filter "*.docx"
    
    $word = New-Object -ComObject "Word.Application"
    $word.Visible = $false
    
    # Now, open each document and list the "ContentControls"
    $all_items = @()
    foreach ($file in $all_docs)
    {
      Write-Host "Processing:" $file.FullName
      # Use a separate variable so the file object from the loop is not overwritten
      $doc = $word.Documents.Open($file.FullName)
    
      Write-Host "Found:" $doc.ContentControls.Count "content controls"
    
      # Build one custom object per document; each titled control becomes a property
      $item = New-Object PSObject
      foreach ($control in $doc.ContentControls)
      {
        # Add-Member needs a non-empty property name, so untitled controls are skipped
        if ($control.Title)
        {
          $item | Add-Member -Type NoteProperty -Name $control.Title -Value $control.Range.Text
        }
      }
      # Echo the object to the console as a quick visual check
      $item
      $all_items += $item
      $doc.Close()
    }
    
    # Quit Word so no hidden WINWORD.EXE process is left running
    $word.Quit()
    
    # Last, we save the collection to a CSV file
    # (Export-Csv takes its column headers from the first object in the collection)
    $all_items | Export-Csv "exportData_DATE.CSV" -NoTypeInformation
    
    
  • #63501
    Don Jones
    Keymaster

    Yeah, Word is about the worst-case scenario for this, unfortunately, and you're going to be stuck with a decade-old COM object to work with. And this isn't _really_ PowerShell; it's COM programming against Word. I say this only because, historically, we've had very low turnout on this type of question, and I didn't want you hanging around waiting for an answer that might not be forthcoming. Sorry :(.

    But, if you figure it out, maybe you'll consider dropping by now and again to answer these kinds of questions when they come up :).

  • #63507
    Ron
    Participant

    It looks like he may have already figured out the COM part, but doesn't know how to get the returned data into the format he wants. It might help to fill out a form with test data, show us the output, and then show us what you want it to be. This might actually be more of a regex question.
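
    As a rough sketch of the regex idea, assuming the text coming back from Word has labelled lines like "Name: ...": the sample text, the field labels, and the variable names below are all made up for illustration, so swap in whatever the real form uses. Word separates paragraphs with a carriage return, which is why the pattern stops at `\r` or `\n`.

    ```powershell
    # Hypothetical text, standing in for what $control.Range.Text or
    # $doc.Content.Text might return for one filled-out form.
    $text = "Name: Jane Doe`rDepartment: Sales`rStart date: 2016-05-01"

    # Hypothetical field labels; replace these with the labels in the real form.
    $labels = 'Name', 'Department', 'Start date'

    $row = New-Object PSObject
    foreach ($label in $labels)
    {
      # Capture everything after "<label>:" up to the end of that paragraph.
      $pattern = [regex]::Escape($label) + ':\s*([^\r\n]*)'
      $value = $null
      if ($text -match $pattern) { $value = $Matches[1].Trim() }
      $row | Add-Member -Type NoteProperty -Name $label -Value $value
    }

    # Show the row as CSV text on the console
    $row | ConvertTo-Csv -NoTypeInformation
    ```

    Building one object per document this way, with the same labels every time, would also keep the CSV columns consistent across files, since a label that is not found still gets an (empty) column.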
