How do I "save as" with a new file extension?

    by dwwilson66 at 2013-01-15 05:12:03

    I'm running this very basic script. My intent is to take ~3000 HTML files and convert them to Word documents. While the "saveas" call works splendidly, the resulting files are formatted as MS Word documents (verified via hex editor), but the file EXTENSION remains *.htm. How can I incorporate the corrected file extension into the saveas-document function?

    $docPath = "c:\users\xxxxxxxx\desktop\dtest"
    $htmPath = "c:\users\xxxxxxxx\desktop\htest"

    $srcfiles = Get-ChildItem $htmPath -filter "*.htm*"

    $saveFormatDoc = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], "wdFormatDocument");

    $global:word = new-object -comobject word.application

    $word.Visible = $False

    function saveas-document ($docs) {
        $opendoc = $word.documents.open($docs) ;
        $savepath = $docs -replace [regex]::escape($htmPath),"$docPath"
        $opendoc.saveas([ref]"$savepath", [ref]$saveFormatDoc);
        $opendoc.close();
    }

    ForEach ($doc in $srcfiles) {
        Write-Host "Processing :" $doc.FullName
        saveas-document -docs $doc.FullName
        $doc = $null
    }
    $word.quit();

    by ArtB0514 at 2013-01-15 06:01:13

    I haven't tried this, but would guess that you need to drop the ".htm" from $savepath. See if this helps:

    $SavePath = "$docPath\$($Docs.BaseName)"
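Because the original function receives the path as a plain string, one possible way to keep the existing folder swap and still change the extension is .NET's `[System.IO.Path]::ChangeExtension`. This is only a sketch: the folders and the `report.htm` file name below are hypothetical placeholders, and nothing here calls Word.

```powershell
# Hypothetical sample folders and file name; substitute the real paths.
$htmPath = Join-Path ([System.IO.Path]::GetTempPath()) 'htest'
$docPath = Join-Path ([System.IO.Path]::GetTempPath()) 'dtest'
$source  = Join-Path $htmPath 'report.htm'

# Swap the folder the same way the original script does...
$savePath = $source -replace [regex]::Escape($htmPath), $docPath

# ...then swap the extension on the resulting string.
$savePath = [System.IO.Path]::ChangeExtension($savePath, '.doc')

$savePath
```

The resulting string could then be handed to SaveAs in place of the old `$savepath`. Note that `-replace` treats `$` in the replacement text specially, so this assumes the target folder path contains no `$` characters.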

    by dwwilson66 at 2013-01-15 06:12:27

    [quote="ArtB0514"]I haven't tried this, but would guess that you need to drop the ".htm" from $savepath. See if this helps:

    $SavePath = "$docPath\$($Docs.BaseName)"[/quote]

    Substituting that for my current declaration of $savepath results in the following:

    Exception calling "SaveAs" with "16" argument(s): "This is not a valid file name.
    Try one or more of the following:
    * Check the path to make sure it was typed correctly.
    * Select a file from the list of files and folders."
    At C:\users\x46332\desktop\RNITEST2.PS1:73 char:20
    + $opendoc.saveas <<<< ([ref]"$savepath", [ref]$saveFormatDoc);
    + CategoryInfo : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : DotNetMethodException

    by ArtB0514 at 2013-01-15 06:45:41

    Wow! 16 arguments to SaveAs!! Something is clearly wrong with at least one element of $savePath. You did check to see what $SavePath resolves to, right? What is it?

    by dwwilson66 at 2013-01-15 06:56:04

    With your modification, based on my code, I would GUESS it resolves to the following. However, being relatively new to PowerShell, perhaps I'm interpreting it incorrectly... what else am I missing?

    $savepath = "$docPath\$($docs.BaseName)" resolves to:
    $savepath = [c:\users\x46332\desktop\dtest]\$(for each $doc in...)[Get-ChildItem [c:\users\x46332\desktop\htest] -filter "*.htm*"]
    [ what's declared for the variable ]
    ( pseudocode )
    [quote="ArtB0514"]Wow! 16 arguments to SaveAs!! Something is clearly wrong with at least one element of $savePath. You did check to see what $SavePath resolves to, right? What is it?[/quote]

    by ArtB0514 at 2013-01-15 10:11:39

    $docpath should resolve to "c:\users\xxxxxxxx\desktop\dtest"
    $docs.BaseName should resolve to "filename", which is the filename minus the folder path and minus the extension.
    So, $savePath should be "c:\users\xxxxxxxx\desktop\dtest\filename"
    I see where I went wrong. Try it this way, passing the complete html document property to saveas-document and selecting appropriate properties there:

    function saveas-document ($docs) {
        "Opening Document $($docs.FullName)"
        $opendoc = $word.documents.open($docs.FullName)
        $savepath = "$docPath\$($docs.BaseName)"
        "Saving $($Docs.Name) to $savepath"
        $opendoc.saveas($savepath, $saveFormatDoc)
        $opendoc.close()
        Get-ChildItem $savepath
    }

    ForEach ($doc in $srcfiles) {saveas-document $doc}
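The difference between the two attempts most likely comes down to what gets passed in: the earlier loop handed the function `$doc.FullName`, a plain string, while this version passes the whole file object. A minimal sketch of that difference, using a hypothetical `report.htm` in the temp folder (the file does not have to exist):

```powershell
# Hypothetical sample paths; the file does not need to exist for this demo.
$docPath = Join-Path ([System.IO.Path]::GetTempPath()) 'dtest'
$file    = [System.IO.FileInfo](Join-Path ([System.IO.Path]::GetTempPath()) 'report.htm')

# A FileInfo object exposes BaseName (the file name without its extension).
$fromObject = Join-Path $docPath $file.BaseName

# A plain string has no BaseName property, so outside strict mode the lookup
# yields $null and the interpolated path stops at the folder separator.
$asString   = $file.FullName
$fromString = "$docPath\$($asString.BaseName)"

$fromObject
$fromString
```

The second value is just the folder followed by a backslash, which would explain Word's "This is not a valid file name" complaint.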

    by dwwilson66 at 2013-01-15 10:41:56

    Holy cow. It works. Thanks. This is awesome.

    A couple questions, since I don't really understand what you did here.
    1.) In your line 5, "Saving $($Docs.Name) to $savepath", what does the $($Docs.Name) mean? Is the first $ an escape character to signify "a variable follows" so we don't output $Docs.Name as a literal string? But then why do we need it in the second part of line 4 and not at the beginning? I'm not quite sure what's going on.
    2.) Your line 8, Get-ChildItem $savepath, errored out... the path C:\foo\bar\DocWithNoExtension doesn't HAVE children. I took out that line and everything appears to be working fine. Does that make sense? Or should something else have happened with that line?
    3.) Even though I'm saving what APPEARS to be BaseName, .doc is appended to the filename. Is this because SaveAs automatically appends the appropriate extension based on file type? Or am I missing a line of code somewhere that assigns the extension to the base name?

    Thanks,
    d

    by nohandle at 2013-01-15 11:17:46

    Hi, hope ArtB0514 won't mind me replying.
    1) The $() is a subexpression; I describe its usage here http://powershell.cz/2012/12/19/subexpressions/ . You are on the right path: the subexpression is needed to evaluate the whole expression, not just expand the variable name. More in the article.
    2) The last line should probably output the newly created item to the pipeline, but I think it should use Get-Item and add the extension on the end: Get-Item ($savepath + $extension), where $extension is set to the correct string (like ".doc"). There is probably a better way to implement the last line; does SaveAs produce any output?
    3) Yes, it appends it automatically.
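To make point 1 concrete, here is a small sketch of the two forms side by side. The `$docs` object below is a hypothetical stand-in with only a `Name` property, not the real file object from the script.

```powershell
# Hypothetical object standing in for the file passed to saveas-document.
$docs = [pscustomobject]@{ Name = 'report.htm' }

# Without $(), only $docs is expanded; ".Name" is kept as literal text.
$withoutSubexpression = "Saving $docs.Name"

# With $(), the whole expression $docs.Name is evaluated first.
$withSubexpression = "Saving $($docs.Name)"

$withoutSubexpression
$withSubexpression
```

This is also why `$docPath` at the start of line 4 needs no `$()`: it is a plain variable with no property access after it, so ordinary expansion is enough.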

    by dwwilson66 at 2013-01-15 12:25:47

    [quote="nohandle"]Hi, hope ArtB0514 won't mind me replying.
    1) The $() is a subexpression; I describe its usage here http://powershell.cz/2012/12/19/subexpressions/ . You are on the right path: the subexpression is needed to evaluate the whole expression, not just expand the variable name. More in the article.[/quote]

    Thanks for the awesome resource. I'll be reading a lot on this website over the coming days.

    [quote="nohandle"]2) The last line should probably output the newly created item to the pipeline, but I think it should use Get-Item and add the extension on the end: Get-Item ($savepath + $extension), where $extension is set to the correct string (like ".doc"). There is probably a better way to implement the last line; does SaveAs produce any output?[/quote]

    The script produces output and runs just fine without that last line. That's why I'm trying to understand what it's SUPPOSED to do. It seems completely unnecessary to me. But again, being a newbie, I may be missing something.

    by nohandle at 2013-01-15 12:41:31

    1) glad you like it 🙂 If you'd like something covered let me know.
    2) I think the purpose was to send the file object to the output so that you can process the files further. SaveAs probably does not output proper file objects. But if you don't need the output, just delete the line.

    by ArtB0514 at 2013-01-15 13:21:43

    My script variation was designed for debugging. That's why there were several strings (listing some input variable and output variable values). The last line was part of the debugging, but because I didn't actually run the script, I made a mistake. The correct last line is:
    Get-ChildItem "$savepath.*"
    The intended result was for the function to return the FileInfo object that was created.
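A rough way to see what that corrected line returns without involving Word: the sketch below creates a hypothetical scratch folder and an empty `report.doc` standing in for the file Word would have saved, then runs the same wildcard lookup.

```powershell
# Hypothetical scratch folder and empty file standing in for Word's output.
$docPath = Join-Path ([System.IO.Path]::GetTempPath()) ("dtest_" + [guid]::NewGuid().ToString('N'))
New-Item -ItemType Directory -Path $docPath | Out-Null
$savepath = Join-Path $docPath 'report'
New-Item -ItemType File -Path "$savepath.doc" | Out-Null

# The wildcard matches "report" plus whatever extension was appended.
$created = Get-ChildItem "$savepath.*"
$created.Name

# Clean up the scratch folder.
Remove-Item -Recurse -Force $docPath
```

One caveat with the wildcard: if the target folder already holds other files with the same base name (say, `report.htm`), they would be returned as well.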

    by dwwilson66 at 2013-01-16 07:25:43

    Aha, makes sense. Now I see...

    thanks for the update.
