Adding content with PdfStamper Part 2 (iText 5)

Filling out a PDF form

There are different flavors of forms in PDF. We’ll discuss the details in topic 8, where we’ll create forms using iText. For now, we’re going to use another tool to create an interactive PDF form.

CREATING A FORM WITH OPEN OFFICE

Figure 6.10 shows how you can use Open Office to create an XML form document. Using the Form Controls toolbar, you can add different kinds of form fields. Figure 6.11 shows a Film Data Sheet. It has text fields for the title, director, year, and duration. It has check boxes for the locations, because one movie can be screened in different movie theaters during the film festival. Finally, it has radio buttons for the category, because each film in the selection belongs to only one category. The properties for each of these fields—name, possible values, and so on—are set in a separate Properties dialog box.


Figure 6.10 Creating an XML form document with Open Office Writer

Figure 6.11 Creating fields in an Open Office document

When you create such a document, you may want to save it as an ODT file first. This will allow you to edit the document afterwards, in case something has to be changed. Then choose File > Export as PDF to open the PDF Options dialog box shown in figure 6.12.

Make sure that the check box next to Create PDF Form is checked. The resulting PDF document will be a form, as shown in figure 6.13.

This is an interactive form. You can start entering data manually into the fields you defined. However, when using Adobe Reader, you’ll get a message saying, "You cannot save data typed into this form." In section 9.2, you’ll see how data entered in a form that has a Submit button can be posted to a server, but the film data sheet you’re using in this topic was created for a different purpose: you’re going to fill it out programmatically, using iText and PdfStamper. That is, after you’ve learned how to inspect the form.

Figure 6.12 Exporting an Open Office document as a PDF form

Figure 6.13 A form created with Open Office Writer

INSPECTING THE FORM AND ITS FIELDS

If you want to fill out the form using iText, you need to know the name of each field you want to fill out. In the case of check boxes and radio buttons, you also need to know the different values that can be chosen. You know these names and values if you’ve created the form yourself, but in most cases the form will be created by a graphic designer. As a developer, you’ll have to inspect the form to find out which names were used.

Listing 6.18 shows the different types of fields you can encounter. These types will be discussed in detail in topic 8, except for signature fields, which will be discussed in topic 12.

Listing 6.18 FormInformation.java
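The following is a minimal sketch of this kind of inspection, not the original listing. It assumes iText 5 (package com.itextpdf.text.pdf) is on the classpath; the class name FormInspector, the helper typeName, and the file name datasheet.pdf are hypothetical and chosen only for illustration.

```java
import com.itextpdf.text.pdf.AcroFields;
import com.itextpdf.text.pdf.PdfReader;

import java.io.IOException;
import java.util.Arrays;

public class FormInspector {

    /** Maps an iText 5 field type constant to a readable label. */
    static String typeName(int fieldType) {
        switch (fieldType) {
            case AcroFields.FIELD_TYPE_CHECKBOX:    return "Checkbox";
            case AcroFields.FIELD_TYPE_COMBO:       return "Combobox";
            case AcroFields.FIELD_TYPE_LIST:        return "List";
            case AcroFields.FIELD_TYPE_NONE:        return "None";
            case AcroFields.FIELD_TYPE_PUSHBUTTON:  return "Pushbutton";
            case AcroFields.FIELD_TYPE_RADIOBUTTON: return "Radiobutton";
            case AcroFields.FIELD_TYPE_SIGNATURE:   return "Signature";
            case AcroFields.FIELD_TYPE_TEXT:        return "Text";
            default:                                return "?";
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default path; pass the real form as the first argument.
        String src = args.length > 0 ? args[0] : "datasheet.pdf";
        PdfReader reader = new PdfReader(src);
        // The AcroFields obtained from a reader is enough for inspection.
        AcroFields form = reader.getAcroFields();
        for (String name : form.getFields().keySet()) {
            int type = form.getFieldType(name);
            System.out.println(name + ": " + typeName(type));
            // For check boxes and radio buttons, also list the possible values.
            if (type == AcroFields.FIELD_TYPE_CHECKBOX
                    || type == AcroFields.FIELD_TYPE_RADIOBUTTON) {
                System.out.println("  states: "
                        + Arrays.toString(form.getAppearanceStates(name)));
            }
        }
        reader.close();
    }
}
```

Printing the appearance states next to each check box and radio group gives you the names and the allowed values in one pass, which is the information needed before filling out the form.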


The result when executing this code for the form shown in figure 6.13 looks like this:

[Program output not preserved.]

Note that the movie theaters are stored in the database like this: CP.1, GP.3, MA.3, … But when you define the check boxes using Open Office (as in figure 6.11), you replace the dot with an underscore character because the dot character is forbidden in field names.
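The mapping between the database codes and the check box names can be sketched as a small helper. This is only an illustration of the naming rule described above; the class and method names are hypothetical.

```java
public class FieldNames {

    /** Converts a location code such as "CP.1" into the field name "CP_1". */
    static String toFieldName(String locationCode) {
        // The dot isn't allowed in the field name, so it is replaced by an underscore.
        return locationCode.replace('.', '_');
    }

    public static void main(String[] args) {
        for (String code : new String[] {"CP.1", "GP.3", "MA.3"}) {
            System.out.println(code + " -> " + toFieldName(code));
        }
    }
}
```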

A check box has two possible values, each corresponding with an appearance state. In the case of the locations, the value can be Off (the check box isn’t checked) or Yes (the check box is checked). These values can vary from PDF to PDF, so it’s important to check the possible states before you start filling out the form. The possible values for the group of radio buttons are either Off (no radio button is selected) or a code that corresponds with the keyword field in the festival_category table (see figure 3.4).
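Because the "checked" value differs from PDF to PDF, it can help to derive it from the states found during inspection instead of hard-coding Yes. The helper below is a sketch under that assumption; it takes the array of states (for instance, as returned by AcroFields.getAppearanceStates() in iText 5) and returns the first one that isn’t Off. The class and method names are hypothetical.

```java
public class CheckBoxStates {

    /** Returns the first appearance state that isn't "Off", i.e. the "checked" value. */
    static String onState(String[] states) {
        for (String state : states) {
            if (!"Off".equals(state)) {
                return state;
            }
        }
        throw new IllegalArgumentException("No state other than Off was found");
    }

    public static void main(String[] args) {
        // States as they might be reported for a check box in the film data sheet.
        System.out.println(onState(new String[] {"Off", "Yes"}));
    }
}
```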

Now that you’ve inspected the form, you have enough information to fill it out using iText.

FILLING OUT THE FORM

Filling out forms programmatically is usually done for two reasons: prefilling data in an editable form, and presenting information in a standard layout.

Imagine an online insurance company. When a customer wants to report an incident, they can log in, and choose among a number of PDF forms. These forms contain a number of standard fields with content that’s already present in the company’s database: name, address, and so on. When the customer logs in, the application could have access to this information, so why require the customer to enter all this information manually? Wouldn’t it be better to take the blank form and prefill part of the information to save time for the customer?

That’s what’s done in figure 6.14. The film data sheet is filled with data from the database, but the data is still editable. In the context of an insurance company, the customer’s phone number could be filled in, but the customer could still change it in case their number has changed.

Another typical use of PDF forms is when you want to use the form as a standard template. You don’t really need a form to communicate with an end user. You just want to create documents that share the same structure, but with differing content.

The PDF shown in figure 6.15 was made using the Film Data Sheet form, but it’s no longer interactive. The form has disappeared. The fields were only used as placeholders for the film title, director, and so on.

The process of keeping the data but removing the form is called flattening, and there are different possibilities in-between. You can choose to flatten only specific fields, or you can change the status of specific fields to read-only. For instance, a customer of an insurance company is allowed to change their telephone number on the prefilled form, but not their name. Flattening will be discussed in topic 8; in this topic, you’ll only use the basic mechanism of form filling.

Figure 6.14 A form filled out using iText

Figure 6.15 A form filled out and flattened using iText

[Listing 6.19 not preserved.]

In this listing, you’re creating a separate document for every movie in the database that was made after 2006. The new reader instance is created inside the loop.

FAQ Why do I get a DocumentException saying "The original document was reused. Read it again from file."? Every PdfReader object can be used for one and only one PdfStamper object. Looking at the example in listing 6.19, you might argue that new PdfReader(DATASHEET) could be moved outside the loop, because it’s the same for all the PdfStamper objects, but that won’t work. As soon as you use a PdfReader object to create a PdfStamper, the reader object is marked as tampered. You can check this by calling reader.isTampered(). If this method returns true, you can’t use the reader to create a new stamper object. You have to create a new instance, which is exactly what the error message tells you.

If you want to fill out a form, you need to have an AcroFields object. You can get an instance of this object using the method getAcroFields().

FAQ Why do I get a DocumentException saying "This AcroFields instance is readonly"? If you look closely at listings 6.18 and 6.19, you’ll see that the getAcroFields() method exists in the PdfReader class as well as in the PdfStamper class. The AcroFields retrieved in listing 6.18 is read-only, and it will throw a DocumentException as soon as you try to fill out a field. You need to use the method with PdfStamper if you want to update the form.

Filling out the form is easy. If you know the field name, such as "title", you can set its value using only one line:

[Code line not preserved.]

As you can see in listing 6.19, the filled-out data sheets of movies dating from 2007 are flattened. Figure 6.15 shows such a data sheet. It looks like an ordinary PDF file. The content is stamped on the document; it’s no longer an editable form. In figure 6.14, you see a data sheet for a movie made in 2008. It’s still a form; you can change the title manually.
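The steps discussed in this section (a fresh PdfReader per PdfStamper, the AcroFields taken from the stamper, one setField() call per field, and optional flattening) can be put together in a minimal sketch. This isn’t the original listing 6.19. It assumes iText 5 is on the classpath; the class name, the helper shouldFlatten, the file names, and the field name "year" are hypothetical (only "title" is named in the text).

```java
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.AcroFields;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;

import java.io.FileOutputStream;
import java.io.IOException;

public class FillSketch {

    /** Mirrors the rule described in the text: data sheets for 2007 movies are flattened. */
    static boolean shouldFlatten(int year) {
        return year == 2007;
    }

    /** Fills one copy of the form at src and writes the result to dest. */
    static void fill(String src, String dest, String title, int year)
            throws IOException, DocumentException {
        // Create a new reader for every stamper; a reader can't be reused.
        PdfReader reader = new PdfReader(src);
        PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
        // Use the AcroFields from the stamper; the one from the reader is read-only.
        AcroFields form = stamper.getAcroFields();
        form.setField("title", title);
        form.setField("year", String.valueOf(year)); // assumed field name
        // Flattening keeps the data but removes the interactive fields.
        stamper.setFormFlattening(shouldFlatten(year));
        stamper.close();
        reader.close();
    }

    public static void main(String[] args) throws IOException, DocumentException {
        // Hypothetical paths and values, for illustration only.
        fill("datasheet.pdf", "datasheet_filled.pdf", "Example Title", 2008);
    }
}
```

Calling fill() once per movie inside a loop gives every output document its own reader and stamper, which avoids the "original document was reused" exception described in the FAQ above.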

There’s much more to say about forms, but we can’t go into further detail until we’ve talked about annotations. Also, I haven’t said anything about the different types of PDF forms yet: there are forms based on AcroForm technology (like the form you created using Open Office), and there are XFA forms (created with Adobe Designer). This will have to wait until topic 8, because we have one more group of PDF manipulation classes left to cover.
