CHAPTER 3
File Structure
In this chapter, we describe the layout and content of the PDF file's four main sections,
and the syntax of the objects that make up each one. We also outline the process of
reading a PDF file into a high-level data structure, and the converse operation of writing
that structure to a PDF file.
File Layout
A simple valid PDF file has four parts, in order:
1. The header, which gives the PDF version number.
2. The body, containing the pages, graphical content, and much of the ancillary
information, all encoded as a series of objects.
3. The cross-reference table, which lists the position of each object within the file, to
facilitate random access.
4. The trailer, including the trailer dictionary, which helps to locate each part of the
file and lists various pieces of metadata that can be read without processing the
whole file.
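As a rough illustration of this four-part layout, the following Python sketch finds where each part begins in a simple file. The function name locate_sections and the sample bytes are hypothetical, introduced only for this illustration. The sketch assumes a single-revision file with newline line endings and the keywords xref and trailer at the start of a line; it is not a general PDF parser.

```python
import re

def locate_sections(data: bytes) -> dict:
    """Return the version and the byte offsets of the four parts.

    Assumes a simple file: one body, one cross-reference table, one
    trailer, and "\n" line endings. Illustrative only.
    """
    # Header: the file begins with "%PDF-" followed by the version number.
    header = re.match(rb"%PDF-(\d+\.\d+)", data)
    if header is None:
        raise ValueError("missing %PDF- header")

    # Body: begins at the first object definition, e.g. "1 0 obj".
    body = re.search(rb"^\d+ \d+ obj", data, re.MULTILINE)
    # Cross-reference table: introduced by "xref" at the start of a line
    # (so the later "startxref" keyword is not matched).
    xref = re.search(rb"^xref\b", data, re.MULTILINE)
    # Trailer: introduced by "trailer" at the start of a line.
    trailer = re.search(rb"^trailer\b", data, re.MULTILINE)

    if body is None or xref is None or trailer is None:
        raise ValueError("could not find all four sections")

    return {
        "version": header.group(1).decode("ascii"),
        "header": header.start(),
        "body": body.start(),
        "xref": xref.start(),
        "trailer": trailer.start(),
    }

# A deliberately abbreviated stand-in, not a complete valid PDF.
sample = (
    b"%PDF-1.0\n"
    b"1 0 obj\n<< /Type /Pages /Kids [2 0 R] /Count 1 >>\nendobj\n"
    b"xref\n0 2\n"
    b"0000000000 65535 f \n"
    b"0000000009 00000 n \n"
    b"trailer\n<< /Size 2 >>\n"
)

offsets = locate_sections(sample)
print(offsets)
```

Scanning forward for keywords like this is only for orientation. It does not use the cross-reference table, which exists so that a reader can reach objects by position without scanning the whole file.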
For reference, we reproduce the “Hello, World” PDF from Chapter 2 as Example 3-1.
The first line of each of the four sections has been annotated.
Example 3-1. A small PDF file
%PDF-1.0 Header starts here
%âãÏÓ
1 0 obj Body starts here
<<
/Kids [2 0 R]
/Count 1
/Type /Pages
>>
endobj
2 0 obj
<<
 