The PDF plug-in reads and writes native metadata items in most PDF files, following the PDF Reference documentation. It has been tested with PDF versions 1.2 to 1.7. The plug-in consists of the file ScavPI.PDF.dll.
Supported RawSourceElements
The plug-in recognizes all common descriptive metadata items inside a PDF file:
| RSE DescriptorPath | Type | Description |
| |PDF|PDF|Title| | String | The document's title. (Optional; PDF 1.1 and higher). Also returns the encoding information (e.g., "ANSI") to FileMind. |
| |PDF|PDF|Author| | String | The name of the person who created the document. (Optional). Also returns the encoding information (e.g., "ANSI") to FileMind. |
| |PDF|PDF|Subject| | String | The subject of the document. (Optional; PDF 1.1 and higher) |
| |PDF|PDF|Keywords| | String | Keywords associated with the document. (Optional; PDF 1.1 and higher) |
| |PDF|PDF|CreationDate| | DateTime (read-only) | The date and time when the document was created. (Optional) |
| |PDF|PDF|ModDate| | DateTime (read-only) | The date and time when the document was most recently modified. (Optional in most files; PDF 1.1 and higher) |
| |PDF|PDF|Language| | String (read-only) | |
| |PDF|PDF|Thumbnails| | Thumbnail Array (read-only) | One or more document previews embedded into the PDF file as thumbnail graphics. |
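The CreationDate and ModDate items are stored inside the file as PDF date strings of the form `D:YYYYMMDDHHmmSSOHH'mm'`, where every field after the year is optional. The following Python sketch shows one way such a raw string could be turned into a date and time value. It is an illustration only: the helper name `parse_pdf_date` is hypothetical, and nothing here describes how the plug-in itself performs the conversion.

```python
import re
from datetime import datetime, timedelta, timezone

# Pattern for PDF date strings such as "D:20091103142533+01'00'".
# Every field after the four-digit year is optional.
_PDF_DATE = re.compile(
    r"^(?:D:)?"
    r"(?P<year>\d{4})"
    r"(?P<month>\d{2})?"
    r"(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?"
    r"(?P<minute>\d{2})?"
    r"(?P<second>\d{2})?"
    r"(?:(?P<sign>[+\-Z])(?:(?P<tz_hour>\d{2})'?(?P<tz_minute>\d{2})?'?)?)?$"
)


def parse_pdf_date(raw: str) -> datetime:
    """Convert a PDF date string into a datetime (hypothetical helper)."""
    match = _PDF_DATE.match(raw.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {raw!r}")
    parts = match.groupdict()

    tzinfo = None  # no offset given: the relation to UTC is unknown
    if parts["sign"] == "Z":
        tzinfo = timezone.utc
    elif parts["sign"] in ("+", "-"):
        offset = timedelta(
            hours=int(parts["tz_hour"] or 0),
            minutes=int(parts["tz_minute"] or 0),
        )
        tzinfo = timezone(offset if parts["sign"] == "+" else -offset)

    # Missing month and day default to 1, missing time fields to 0.
    return datetime(
        int(parts["year"]),
        int(parts["month"] or 1),
        int(parts["day"] or 1),
        int(parts["hour"] or 0),
        int(parts["minute"] or 0),
        int(parts["second"] or 0),
        tzinfo=tzinfo,
    )
```

For example, `parse_pdf_date("D:20091103142533+01'00'")` yields 3 November 2009, 14:25:33 with a UTC offset of one hour, while a string that carries only a year, such as `"D:2009"`, yields 1 January of that year with no time zone attached.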
In addition, the plug-in shows administrative and structural information, along with user-defined custom items (supported by some PDF-producing tools, e.g., PDF/X fields):
| RSE DescriptorPath | Type | Description |
| |PDF|PDF|Creator| | String | The name of the application (e.g., Microsoft Word) that created the original document from which it was converted (Optional). Also returns the encoding information (e.g., "ANSI") to FileMind. |
| |PDF|PDF|Producer| | String | The name of the tool (e.g., Acrobat Distiller or PDF Creator) that converted the document into PDF (Optional). Also returns the encoding information (e.g., "ANSI") to FileMind. |
| |PDF|PDF|Trapped| | String | Indicates whether the document includes trapping information. (Optional; PDF 1.3 and higher) |
| |PDF|PDF|Version| | String (read-only) | The PDF version of the file. This RSE is not extracted from inside the file; the plug-in identifies the PDF version and returns it as a metadata item. |
| |PDF|PDF|Encrypted| | Boolean (read-only) | Indicates if the PDF document is encrypted. |
| |PDF|PDF|EncryptedRevision| | Numeric (read-only) | |
| |PDF|PDF|EncryptMetadata| | Boolean (read-only) | |
| |PDF|PDF|AllowPrinting| | Boolean (read-only) | Indicates if the PDF document can be printed. |
| |PDF|PDF|AllowModify| | Boolean (read-only) | Indicates if the PDF document can be modified or annotated. |
| |PDF|PDF|AllowCopy| | Boolean (read-only) | |
| |PDF|PDF|AllowAnnotationModify| | Boolean (read-only) | Indicates if the PDF document can be marked up (annotated). |
| |PDF|PDF|AllowFillInteractiveForms| | Boolean (read-only) | Indicates if the PDF document is a form with fields that can be filled. |
| |PDF|PDF|AllowExtractTextAndGraphics| | Boolean (read-only) | Indicates if elements inside the PDF document (text, pictures) can be copied and pasted. |
| |PDF|PDF|AllowAssembleDocument| | Boolean (read-only) | |
| |PDF|PDF|AllowHighQualityPrinting| | Boolean (read-only) | Indicates if the PDF document supports printing at printer resolution quality (as opposed to screen resolution). |
| |PDF|PDF|PagesCount| | Numeric (read-only) | Number of pages |
| |PDF|PDF|PageHeight| | Floating (read-only) | |
| |PDF|PDF|PageWidth| | Numeric (read-only) | |
| |PDF|PDF|PageUnits| | String (read-only) | Page resolution (e.g., "1/72"). Note: the raw value is formatted with ScavAPI.Common.Formatters.RationalFormatter. |
| |PDF|PDF|PageUnitsMetric| | String (read-only) | Name of the measuring unit (e.g., "Inches"). |
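In an encrypted PDF file, access permissions are stored as a single integer (the `/P` entry of the encryption dictionary), in which individual bits grant individual rights. The Python sketch below shows how such an integer could be decoded into Boolean values like the Allow... items listed above. The bit positions are those given in the PDF Reference, but the pairing of each bit with an RSE name is an assumption made for illustration, and the function name `decode_permissions` is hypothetical; the plug-in's actual mapping is not documented here.

```python
from typing import Dict

# Bit positions follow the PDF Reference, where bit 1 is the lowest-order bit
# of the /P integer. ASSUMPTION: the pairing with the RSE names below is a
# guess made for this sketch, not the plug-in's documented behavior.
PERMISSION_BITS = {
    "AllowPrinting": 3,
    "AllowModify": 4,
    "AllowCopy": 5,
    "AllowAnnotationModify": 6,
    "AllowFillInteractiveForms": 9,
    "AllowExtractTextAndGraphics": 10,
    "AllowAssembleDocument": 11,
    "AllowHighQualityPrinting": 12,
}


def decode_permissions(p_value: int) -> Dict[str, bool]:
    """Map a /P integer to permission names and Boolean values."""
    # Python treats negative integers as two's complement for bitwise
    # operations, so the usual negative /P values can be tested directly.
    return {
        name: bool(p_value & (1 << (bit - 1)))
        for name, bit in PERMISSION_BITS.items()
    }
```

With this mapping, a `/P` value of -1852 decodes to printing and high-quality printing being allowed and everything else being denied, while -4 grants all eight permissions. Bits 9 to 12 are only meaningful for security handlers of revision 3 or higher, so a real implementation would probably also take the revision number into account.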
Metadata Streams: XMP items inside of PDF files
Files written in PDF 1.4 or higher allow the embedding of one or more Metadata Streams with XMP information, which can be extracted by the XMP plug-in. PDF files that are compliant with the archiving standard PDF/A must contain certain XMP metadata items. Beta Note: not yet fully implemented.
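Because Metadata Streams require PDF 1.4 or higher, a quick look at the file header can indicate whether XMP information is to be expected at all. The Python sketch below reads the `%PDF-x.y` header comment and compares it against version 1.4. The function names are hypothetical, and the check is only a rough one: from PDF 1.4 onward a file may declare a later version inside its document catalog, which a header-only check does not see. How the plug-in determines the version is not described here.

```python
import re
from typing import Optional, Tuple

# The header comment looks like "%PDF-1.7".
_HEADER = re.compile(rb"%PDF-(\d)\.(\d)")


def header_version(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (major, minor) from the %PDF-x.y header, or None if absent."""
    # The header normally sits at the very start of the file; searching the
    # first 1024 bytes tolerates a small amount of leading data.
    match = _HEADER.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def may_contain_metadata_streams(data: bytes) -> bool:
    """True if the header version is 1.4 or higher (header-only check)."""
    version = header_version(data)
    return version is not None and version >= (1, 4)
```

A caller would typically pass the first kilobyte of the file, for instance the result of `open(path, "rb").read(1024)`.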
Metadata Scrubbing
Native PDF metadata items (located in the document information dictionary section of the file) cannot be scrubbed. Most items can be manually overwritten with an empty string, however.
Removing XMP Metadata Streams is difficult and requires completely rewriting the file's content, which is not currently supported by this plug-in (or the XMP plug-in). However, XMP items can be manually overwritten with empty values. Note that this does NOT physically remove any older XMP Metadata Streams or streams that belong to embedded objects (image files inside a PDF file can also have their own XMP metadata). PDF files that have been incrementally saved can contain multiple packets that all look like the “main” XMP Metadata Stream, because during an incremental save new data is written to the end of the file without removing the old.
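To get a rough idea of whether a file still carries older or embedded XMP packets, the raw bytes can be scanned for XMP packet headers and end-of-file markers. The Python sketch below does this with plain byte searches. It is a heuristic under stated assumptions, not part of the plug-in: the function names are hypothetical, only packets stored unencoded (neither compressed nor encrypted, and not in UTF-16) are found, and `%%EOF` markers inside embedded files can inflate the count.

```python
from typing import Dict

XMP_PACKET_HEADER = b"<?xpacket begin="
EOF_MARKER = b"%%EOF"


def count_xmp_packets(data: bytes) -> int:
    """Count XMP packet headers that are visible in the raw bytes."""
    return data.count(XMP_PACKET_HEADER)


def count_eof_markers(data: bytes) -> int:
    """Count %%EOF markers; more than one suggests incremental saves."""
    return data.count(EOF_MARKER)


def summarize_leftovers(data: bytes) -> Dict[str, int]:
    """Collect both counts for a quick inspection of a file's bytes."""
    return {
        "xmp_packets": count_xmp_packets(data),
        "eof_markers": count_eof_markers(data),
    }
```

If `summarize_leftovers(open(path, "rb").read())` reports more than one XMP packet, overwriting the current XMP items with empty values will most likely leave at least one earlier or embedded packet readable in the file.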
Adobe Acrobat Reader
The screenshot below shows how the RSEs known to the plug-in relate to the information shown on the Description tab of the document properties dialog in Adobe Acrobat Reader 9.2:

