Basic PDF Syntax
A PDF file contains at least three distinct languages:
• The document content, which is a number of objects with links between them forming a directed graph. These objects describe the structure of the document (pages, metadata, fonts, and resources).
• The page content, described using a series of operators for placing text and graphics on a single page.
• The file structure, consisting of a header, trailer, and cross-reference table that help programs locate and read the file's contents.
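To make the file structure a little more concrete, here is a minimal Python sketch that looks for the header at the start of a file and the cross-reference pointer at the end. The function name inspect_file_structure is hypothetical, and the sketch assumes a simple, well-formed file: the header has the form %PDF-x.y, and the file ends with the keyword startxref, a byte offset, and the %%EOF marker. Real-world files can deviate from this (for example, with extra bytes before the header), so treat it as an illustration rather than a robust parser.

```python
import re


def inspect_file_structure(data: bytes):
    """Return (version, startxref_offset) for PDF bytes, or raise ValueError.

    Hypothetical helper for illustration; assumes a simple, well-formed file.
    """
    # The header is expected at the very start of the file.
    header = re.match(rb"%PDF-(\d+\.\d+)", data)
    if header is None:
        raise ValueError("missing %PDF- header")
    # The file is expected to end with 'startxref', a byte offset, then '%%EOF'.
    tail = re.search(rb"startxref\s+(\d+)\s+%%EOF\s*$", data)
    if tail is None:
        raise ValueError("missing startxref / %%EOF at end of file")
    return header.group(1).decode("ascii"), int(tail.group(1))
```

The offset returned is where a reader would go to find the cross-reference table, which in turn tells it where each object in the file starts.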
Document Content
The document content consists of objects built out of, amongst others, the following
elements:
• Names, written as /Name.
• Integers, like 50.
• Strings, enclosed in parentheses, like (The Quick Brown Fox).
• References to other objects, like 2 0 R, a reference to object 2.
• Arrays (ordered collections) of objects, like [50 30 /Fred], an array of three items, in order: 50, 30, and /Fred.
• Dictionaries (unordered maps from names to objects), like << /Three 3 /Five 5 >>, which maps /Three to 3 and /Five to 5.
• Streams, which consist of a dictionary and some binary data. These are used to store streams of PDF graphics operators and other binary data such as images and fonts.
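One way to get a feel for these elements is to model them in a general-purpose language and print them back out in PDF syntax. The following Python sketch does this for names, integers, strings, references, arrays, and dictionaries. The classes Name and Ref and the function to_pdf are hypothetical names chosen for illustration. The sketch assumes the second number in a reference (the 0 in 2 0 R) is the generation number and defaults it to 0. Streams, and names containing special characters, are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """A PDF name such as /Fred (stored without the leading slash)."""
    value: str


@dataclass(frozen=True)
class Ref:
    """A reference to another object, such as 2 0 R."""
    number: int
    generation: int = 0  # assumed default for this sketch


def to_pdf(obj):
    """Serialize a Python value to PDF syntax (illustrative, not complete)."""
    if isinstance(obj, Name):
        return "/" + obj.value
    if isinstance(obj, Ref):
        return f"{obj.number} {obj.generation} R"
    if isinstance(obj, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(obj, int):
        return str(obj)
    if isinstance(obj, str):
        # Escape backslashes and parentheses inside the string.
        escaped = obj.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        return "(" + escaped + ")"
    if isinstance(obj, list):
        return "[" + " ".join(to_pdf(item) for item in obj) + "]"
    if isinstance(obj, dict):
        body = " ".join(f"{to_pdf(k)} {to_pdf(v)}" for k, v in obj.items())
        return "<< " + body + " >>"
    raise TypeError(f"unsupported object: {obj!r}")
```

For instance, to_pdf([50, 30, Name("Fred")]) produces [50 30 /Fred], and to_pdf({Name("Three"): 3, Name("Five"): 5}) produces << /Three 3 /Five 5 >>. A Python dict keeps insertion order, whereas a PDF dictionary is unordered, so the order of the printed entries carries no meaning.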
For example, here's a page object, which is a dictionary containing a number of items, each associated with a name:
<< /Type /Page
/MediaBox [0 0 612 792]
/Resources 3 0 R
/Parent 1 0 R
/Contents [4 0 R]
>>
This dictionary contains five entries:
/Type /Page
The name /Page is associated with the dictionary key /Type.
/MediaBox [0 0 612 792]
The array of four integers [0 0 612 792] is associated with the dictionary key /MediaBox.