(v13) Inspecting PDF files
This page applies to Harlequin v13.1r0 and later, and to Harlequin Core but not Harlequin MultiRIP.
You can inspect the contents of a PDF file before optionally proceeding to image it. The PDF file itself cannot be altered, but you do have the opportunity to alter the imaging parameters.
A typical sequence of use is:
- Call pdfopen. This opens a PDF execution context on an open file and identifies an initial set of imaging parameters. (The execution context is a concept specific to the RIP, providing a simple way of identifying the opened file and associated PDF parameters.)
- Call getPDFtrailer. This provides basic file information and access to the file’s Catalog, and from there to the hierarchy of objects in the PDF file.
- Call getPDFobject, using the information obtained from getPDFtrailer. Repeat this call for all interesting objects until you have the information you require.
- Optionally, call pdfexecid. This images the file, potentially with alterations to the PDF parameters. You can make these alterations to PDF parameters and, possibly, prior changes to the pagedevice based on information obtained in the earlier steps of this sequence. For example, you might alter the PDFPageRange or choose a different kind of media.
- Call pdfclose. You must always do this to close the PDF execution context on the file. GGS suggests that you also need a closefile.
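The inspection part of this sequence (trailer, then Catalog, then individual objects, then close) can be pictured with a small Python model. This is only an illustrative sketch: ToyContext, Ref, OBJECTS and TRAILER are hypothetical names invented for the example and are not part of the RIP or its PostScript operators, whose operands are not described on this page. The keys Root, Pages, Kids and MediaBox are standard PDF dictionary keys. The toy page tree is assumed to be flat, whereas a real file may nest Pages nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """An indirect reference: object number plus generation number."""
    num: int
    gen: int = 0


# Hand-built object table standing in for a parsed PDF file (hypothetical data).
OBJECTS = {
    (1, 0): {"Type": "Catalog", "Pages": Ref(2)},
    (2, 0): {"Type": "Pages", "Count": 2, "Kids": [Ref(3), Ref(4)]},
    (3, 0): {"Type": "Page", "MediaBox": [0, 0, 612, 792]},
    (4, 0): {"Type": "Page", "MediaBox": [0, 0, 595, 842]},
}
TRAILER = {"Size": 5, "Root": Ref(1)}


class ToyContext:
    """Hypothetical stand-in for a PDF execution context (not the RIP API)."""

    def __init__(self, objects, trailer):
        self._objects = objects
        self._trailer = trailer
        self._cache = {}
        self.reads = 0        # counts uncached lookups only
        self.closed = False

    def trailer(self):
        # Plays the role of the "get the trailer" step.
        return self._trailer

    def get_object(self, ref):
        # Plays the role of the "get an object" step; repeated reads hit the cache.
        if self.closed:
            raise ValueError("context is closed")
        key = (ref.num, ref.gen)
        if key not in self._cache:
            self.reads += 1
            self._cache[key] = self._objects[key]
        return self._cache[key]

    def close(self):
        self.closed = True


def page_media_boxes(ctx):
    """Walk trailer -> Catalog -> Pages -> each Page and collect MediaBox."""
    catalog = ctx.get_object(ctx.trailer()["Root"])
    pages = ctx.get_object(catalog["Pages"])
    return [ctx.get_object(kid)["MediaBox"] for kid in pages["Kids"]]


ctx = ToyContext(OBJECTS, TRAILER)
try:
    boxes = page_media_boxes(ctx)
    boxes_again = page_media_boxes(ctx)   # second walk is served from the cache
finally:
    ctx.close()                           # always close the context
```

The try/finally mirrors the rule that the context must always be closed, and the reads counter shows why a second traversal costs nothing extra when the objects are cached on the other side of the call.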
The PDF objects that your code reads are cached by the RIP, freeing your code from the need to maintain a separate cache for efficient repeated reads of the same objects.
It is possible to open a PDF execution context for each of several files.
The sequence just described assumes that your code starts by knowing that the job is PDF. It is also possible to discover while processing a job that it is PDF. If this is the case, the RIP can use the operator getPDFcontext to find the context identifier for subsequent use of getPDFtrailer and getPDFobject.
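As a rough illustration of what "discovering that a job is PDF" can mean outside the RIP, the sketch below checks the start of a byte stream for the standard PDF header marker %PDF-. It does not describe how the RIP itself detects PDF, and looks_like_pdf is a hypothetical helper name. The 1024-byte window is an assumption, chosen because some PDF readers accept a header anywhere within the first 1024 bytes rather than only at offset zero.

```python
def looks_like_pdf(head: bytes) -> bool:
    """Return True if the PDF header marker appears in the first 1024 bytes."""
    return b"%PDF-" in head[:1024]


pdf_job = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
ps_job = b"%!PS-Adobe-3.0\n/Helvetica findfont 12 scalefont setfont\n"

pdf_detected = looks_like_pdf(pdf_job)
ps_detected = looks_like_pdf(ps_job)
```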