Things you can do with PDF (iText 5)

Let’s start with six quick facts about PDF:

■ PDF is the Portable Document Format.

■ It’s an open file format (ISO 32000-1), originally created by Adobe.

■ It’s used for documents that are independent of system software and hardware.

■ PDF documents are an essential part of the web.

■ Adobe Reader is the most widely used PDF viewer.

■ There are a lot of free and proprietary, open and closed source, desktop and web-based software products for creating, viewing, and manipulating PDF documents.

Figure 1.1 offers an overview of the things you can do with PDF. There are tools to create PDF documents, there are applications to consume PDF documents, and there are utilities to manipulate existing PDF documents.

If you look at PDF creation, you’ll find that graphic designers use desktop applications such as Adobe Acrobat or Adobe InDesign to create a document in a manual or semimanual process. In another context, PDF documents are created programmatically, using an API to produce PDFs directly from software applications with little or no human intervention. Sometimes the document is created in an intermediary format first, then converted to PDF. These different approaches demand different software products. The same goes for PDF manipulation. You can update a PDF manually in Adobe Acrobat, but there are also tools that fill out forms automatically based on information from a database.


Figure 1.1 Overview of PDF-related functionality. The functionality covered by iText is marked with the iText logo.

This topic will focus on the automation side of things: we’ll create and manipulate PDF documents in an automated process using iText. The functionality covered by iText in figure 1.1 is marked with the iText logo. A smaller logo indicates that the functionality is only partly supported.

Typically, iText is used in projects that have one of the following requirements:

■ The content isn’t available in advance: it’s calculated based on user input or real-time database information.

■ The PDF files can’t be produced manually due to the massive volume of content: a large number of pages or documents.

■ Documents need to be created in unattended mode, in a batch process.

■ The content needs to be customized or personalized; for instance, the name of the end user has to be stamped on a number of pages.

Often you’ll encounter these requirements in web applications, where content needs to be served dynamically to a browser. Normally, you’d serve this information in the form of HTML, but for some documents, PDF is preferred over HTML for better printing quality, for identical presentation on a variety of platforms, for security reasons, or to reduce the file size. In this case, you can serve PDF on the fly.
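As a rough illustration of what "on the fly" can look like, the following sketch builds a one-page PDF entirely in memory with iText 5. It assumes the iText 5 jar (the `com.itextpdf` packages) is on the classpath; the class name `OnTheFlyPdf`, the method `createPdf`, and the sample text are made up for this example.

```java
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Paragraph;
import com.itextpdf.text.pdf.PdfWriter;

import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical class name; assumes iText 5 is available on the classpath.
public class OnTheFlyPdf {

    /** Builds a one-page PDF in memory; userName stands in for dynamic input. */
    public static byte[] createPdf(String userName) throws DocumentException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        Document document = new Document();        // default page size (A4)
        PdfWriter.getInstance(document, buffer);   // the writer streams into the buffer
        document.open();
        document.add(new Paragraph("Report generated for " + userName));
        document.close();                          // finishes the file and flushes the writer
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws DocumentException {
        byte[] pdf = createPdf("Jane Doe");
        // Every PDF file starts with the marker "%PDF-".
        String header = new String(pdf, 0, 5, StandardCharsets.US_ASCII);
        System.out.println(header + " / " + pdf.length + " bytes");
    }
}
```

In a web application, the returned bytes would typically be written to the HTTP response with the content type set to `application/pdf`; that part is left out here to keep the sketch free of servlet dependencies.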

As you read this topic, you’ll create and manipulate hundreds of PDF documents that demonstrate how to use a specific feature, how to solve common and less common issues, and how to build an application that involves PDF technology. We’ll use iText because it’s an API that was developed to allow developers to do the following (and much more):

■ Generate documents and reports based on data from an XML file or a database

■ Create maps and books, exploiting numerous interactive features available in PDF

■ Add bookmarks, page numbers, watermarks, and other features to existing PDF documents

■ Split or concatenate pages from existing PDF files

■ Fill out interactive forms

■ Serve dynamically generated or manipulated PDF documents to a web browser
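To give a feel for the manipulation side, here is a minimal sketch that stamps a line of text on every page of an existing PDF using iText 5's `PdfReader` and `PdfStamper`. It assumes iText 5 is on the classpath. The class name `StampName`, the helper `createSource` (which only exists so the sketch needs no input file), the text position, and the wording of the stamp are illustrative choices, not something prescribed by the library.

```java
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Element;
import com.itextpdf.text.Paragraph;
import com.itextpdf.text.Phrase;
import com.itextpdf.text.pdf.ColumnText;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;
import com.itextpdf.text.pdf.PdfWriter;

import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical class name; assumes iText 5 is available on the classpath.
public class StampName {

    /** Creates a small multi-page source PDF so the sketch is self-contained. */
    static byte[] createSource(int pages) throws DocumentException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Document document = new Document();
        PdfWriter.getInstance(document, out);
        document.open();
        for (int i = 1; i <= pages; i++) {
            if (i > 1) {
                document.newPage();                // start a fresh page after the first
            }
            document.add(new Paragraph("Page " + i));
        }
        document.close();
        return out.toByteArray();
    }

    /** Returns a copy of the source PDF with the name stamped on each page. */
    public static byte[] stamp(byte[] source, String name)
            throws IOException, DocumentException {
        PdfReader reader = new PdfReader(source);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        PdfStamper stamper = new PdfStamper(reader, out);
        for (int page = 1; page <= reader.getNumberOfPages(); page++) {
            // Draw on top of the existing page content, near the bottom-left corner.
            ColumnText.showTextAligned(stamper.getOverContent(page),
                    Element.ALIGN_LEFT, new Phrase("Prepared for " + name),
                    36, 20, 0);
        }
        stamper.close();                           // writes the modified document
        reader.close();
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] stamped = stamp(createSource(3), "Jane Doe");
        System.out.println("Stamped PDF: " + stamped.length + " bytes");
    }
}
```

The same reader-plus-stamper pattern is a reasonable starting point for page numbers or watermarks: only the content drawn inside the loop changes.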

For first-time users, this topic is indispensable. Although the basic functionality of iText is easy to grasp, the first parts of this topic flatten the learning curve considerably and introduce more advanced functionality step by step.

It’s also a must-have for the many developers who are already familiar with iText. The final topics unveil many PDF secrets hidden in ISO 32000-1, the open standard that defines the Portable Document Format. Even experienced iText developers will learn new ways to master the PDF specification using their favorite PDF library.

Without further ado, let’s start with a simple example that explains how to compile and run the many examples that come with this topic.
