November 25, 2016 at 6:47 pm #58702
Does anyone know how to iterate over a collection of PDFs using PowerShell, looking for documents with specific information stored in one of the info properties, e.g., info.subject?
As part of a document-workflow process I'm working on, I'm going to be storing values in some of the info properties, viz., Author, Subject, and Keywords. It would be very useful to later be able to find PDF documents that have specific information stored in those properties.
In Windows, if you look at the Properties for a PDF document, on the PDF tab you can see some of the info properties. Ideally, I'd be able to access those properties without having to actually open the PDF.
November 25, 2016 at 7:06 pm #58705
Out of the box, without a specialized function or module, PowerShell does not know much about PDF files. You can show what's available with:
Get-Item -Path '<path to your pdf file>\<your pdf file>.pdf' | Select-Object -Property *
November 25, 2016 at 7:23 pm #58706
Thanks for your reply.
Yes, the info properties are metadata, so they don't show up in the standard file properties.
I'm hoping someone has already cracked this nut. The Adobe Acrobat technical support community doesn't seem to have the breadth and depth I find for MSFT products.
November 27, 2016 at 11:45 am #58763
Check out this post showing how to use itextsharp.dll to parse PDF files:
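For anyone landing here later, a rough sketch of what the iTextSharp approach might look like is below. It is not taken from the linked post. It assumes iTextSharp 5.x (where PdfReader exposes the document info dictionary through its Info property) and PowerShell 3.0 or later. The DLL path, the folder, and the subject value are hypothetical placeholders.

```powershell
# Assumption: iTextSharp 5.x. The DLL path is hypothetical; point it at your own copy.
Add-Type -Path 'C:\Tools\itextsharp.dll'

# Hypothetical folder and search value, for illustration only.
$folder  = 'C:\Docs'
$subject = 'Invoice'

Get-ChildItem -Path $folder -Filter '*.pdf' -File -Recurse | ForEach-Object {
    $file   = $_      # keep the file object; inside catch, $_ is the error record
    $reader = $null
    try {
        # PdfReader parses the file structure only; no viewer is launched.
        $reader = New-Object -TypeName iTextSharp.text.pdf.PdfReader -ArgumentList $file.FullName
        $info   = $reader.Info   # document info dictionary (Author, Subject, Keywords, ...)

        # Emit one object per PDF; missing keys are left empty.
        [pscustomobject]@{
            Path     = $file.FullName
            Author   = if ($info.ContainsKey('Author'))   { $info['Author'] }
            Subject  = if ($info.ContainsKey('Subject'))  { $info['Subject'] }
            Keywords = if ($info.ContainsKey('Keywords')) { $info['Keywords'] }
        }
    }
    catch {
        Write-Warning "Could not read $($file.FullName): $($_.Exception.Message)"
    }
    finally {
        if ($reader) { $reader.Close() }
    }
} | Where-Object { $_.Subject -eq $subject }
```

The filter at the end is an exact match on Subject. Swapping in -like or -match, or filtering on Author or Keywords instead, should work the same way, since each PDF comes through the pipeline as a plain object.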
The topic ‘powershell - iterate over a collection of PDFs looking for specific info.subject’ is closed to new replies.