ConvertFrom-PDF PowerShell Cmdlet | convert a c# .net program to powershell cmd

This topic contains 6 replies, has 4 voices, and was last updated by Jørgen Guldmann 10 months, 1 week ago.

  • #20720

    H Man

    Can anyone help me convert a C# .NET program to a PowerShell cmdlet?

    Has anyone used the ConvertFrom-PDF cmdlet?

    I need to scan PDFs, and I can't seem to get the source code in this blog post into a working cmdlet.

    Any help would be greatly appreciated.

  • #20725

    Dave Wyatt

    I've done some work with the iTextSharp libraries directly in PowerShell before. You can see an example at . You will need to download a copy of iTextSharp.dll.

  • #20739

    H Man

    Hi Dave, thanks for getting back to me.

    I tried the Get-ReferencesFromPdf cmdlet. It didn't return any data, and there were no errors either. Any suggestions for troubleshooting this?

    I do have the iTextSharp.dll and created the same directory structure from the post.

  • #20740

    Dave Wyatt

    That function was written specifically for the question posted on that thread, looking for section numbers followed by some number of lines matching ABC-*. It's not meant for you to be able to run it directly.

    However, the code does show you how to use the PdfReader and PdfTextExtractor classes to pull text out of a PDF into a .NET String variable. From there, you can split it by line as in the example, or just work with the whole page text as one string; that's up to you.

    Here's a more trimmed down example that just extracts all of the text from the PDF and outputs it as a single string, that you can manipulate however you want:

    function Get-PdfText
    {
        [CmdletBinding()]
        param (
            [Parameter(Mandatory = $true)]
            [string] $Path
        )
        # Turn a relative or PowerShell-style path into a full file system path.
        $Path = $PSCmdlet.GetUnresolvedProviderPathFromPSPath($Path)
        $reader = New-Object iTextSharp.text.pdf.pdfreader -ArgumentList $Path
        $stringBuilder = New-Object System.Text.StringBuilder
        for ($page = 1; $page -le $reader.NumberOfPages; $page++)
        {
            $text = [iTextSharp.text.pdf.parser.PdfTextExtractor]::GetTextFromPage($reader, $page)
            $null = $stringBuilder.AppendLine($text)
        }
        $reader.Close()
        return $stringBuilder.ToString()
    }

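    Once the text is in a string, splitting it by line and filtering could look like the sketch below. The sample string is a hypothetical stand-in for whatever Get-PdfText returns, and the ABC-* pattern is borrowed from the earlier thread only as an illustration.

```powershell
# Hypothetical sample content standing in for the output of Get-PdfText.
$pdfText = "1.1 Introduction`r`nABC-100 First item`r`nABC-200 Second item`r`n2.1 Summary`r`n"

# Split on Windows or Unix line endings and drop blank lines.
$lines = @($pdfText -split '\r?\n' | Where-Object { $_.Trim() -ne '' })

# Keep only the lines that match the pattern of interest.
$abcLines = @($lines | Where-Object { $_ -like 'ABC-*' })

$abcLines
```

    Working on the whole string at once (for example with -match) is just as valid; splitting first is only convenient when the data of interest is line-oriented.
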
  • #20742

    H Man

    OK, I tried it, and it's still not returning a string. Should I still be using

    Add-Type -Path .\PdfToText\itextsharp.dll

    I feel like I'm not placing this .dll in the right spot.

    Any other suggestions?


  • #20752

    Tim Pringle

    Try something like this



    (Copying the file there as well of course)

    Use fully qualified paths BTW.

    Then test Dave's function. Worked good for me.
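
    As a rough sketch of the fully qualified path advice: a relative path such as .\PdfToText\itextsharp.dll only works when the current location happens to be the folder above it, so it can help to resolve the path first and fail loudly if the file is not there. Import-PdfLibrary is a hypothetical helper name, and the path in the usage comment is only a placeholder.

```powershell
function Import-PdfLibrary
{
    param (
        [Parameter(Mandatory = $true)]
        [string] $DllPath
    )

    # Resolve-Path throws (because of -ErrorAction Stop) if the file does not exist,
    # which surfaces a wrong folder immediately rather than later in the script.
    $fullPath = (Resolve-Path -LiteralPath $DllPath -ErrorAction Stop).ProviderPath

    # Load the assembly from its full file system path.
    Add-Type -Path $fullPath
}

# Usage (placeholder path, adjust to wherever the DLL was copied):
# Import-PdfLibrary -DllPath 'C:\Tools\iTextSharp\itextsharp.dll'
```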

  • #77280

    Jørgen Guldmann

    For some reason I cannot load the DLL as described.
    Instead I have to do it like this:

    $bytes = [System.IO.File]::ReadAllBytes("c:\...\itextsharp.dll")
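
    The snippet above stops after reading the file. A plausible continuation, which the post does not confirm, is to hand the bytes to [System.Reflection.Assembly]::Load. One common reason this route works when Add-Type -Path does not is that the DLL was downloaded and is still marked as blocked by Windows, though that is an assumption here because the post does not show the error. Import-AssemblyFromBytes is a hypothetical helper name.

```powershell
function Import-AssemblyFromBytes
{
    param (
        [Parameter(Mandatory = $true)]
        [string] $DllPath
    )

    # Fail early with a clear error if the file is missing.
    $fullPath = (Resolve-Path -LiteralPath $DllPath -ErrorAction Stop).ProviderPath

    # Read the raw file contents, then load the assembly from memory
    # rather than from its location on disk.
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)
    return [System.Reflection.Assembly]::Load($bytes)
}

# Usage (the path is whatever full path the DLL actually lives at):
# $null = Import-AssemblyFromBytes -DllPath $pathToITextSharpDll
```

    An assembly loaded this way has no file location, so anything that depends on the DLL's path on disk will not see it.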
