Chapter 19: Accessing XML Documents Using SQL
In This Chapter
XML (or eXtensible Markup Language) has become increasingly popular for a variety of applications
ranging from platform-independent data transfer, exemplified by the legal invoicing example illustrated
in Chapters 11 and 18, to use in configuration files such as the web.xml file used by the Tomcat server.
XML documents are in many ways similar to the HTML documents familiar from Web applications.
The primary difference between XML and HTML is that XML documents are based on user-defined tags,
whereas HTML tags are predefined for use by the browser. An important secondary difference is that
XML documents must be well formed in order to be machine readable.
To be well formed, a document must follow a few simple rules. The most important of these are that all
tags must be properly closed, and that when tags are nested they must be nested correctly. A properly
closed tag is a tag that either has a closing tag after its contents, or is self-closing. The two forms of
properly closed tag look like this: <tag>contents</tag> and <tag/>.
Proper nesting requires that nested tags be closed in the reverse of the order in which they were
opened. For example, in <tag><nested>contents</nested></tag>, the nested element is nested inside the
tag element, and is closed before the tag element is closed.
These rules are similar to the rules a programmer follows when balancing braces or parentheses.
However, it is important to realize that unlike HTML, which lets you get away with breaking these rules,
XML requires that they be obeyed: an XML parser rejects a document that is not well formed.
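The two rules can be checked with the XML parser that ships with the JDK. The following is a minimal sketch, not code from this chapter: the class name WellFormedCheck, the method isWellFormed, and the sample tags are assumptions made for illustration. It treats any parse error reported by the standard DocumentBuilder as a sign that the document is not well formed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, used only to illustrate the well-formedness rules.
class WellFormedCheck {

    // Returns true if the parser accepts the text, false if it reports a parse error.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps the parser
            // from printing its own messages to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Unclosed or incorrectly nested tags end up here.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<tag>contents</tag>",               // closing tag after the contents
            "<tag/>",                            // self-closing tag
            "<tag><nested>x</nested></tag>",     // closed in reverse order of opening
            "<tag><nested>x</tag></nested>",     // closed in the wrong order
            "<tag>contents"                      // never closed
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isWellFormed(sample));
        }
    }
}
```

The parser accepts the first three samples and rejects the last two, without knowing anything about what the tag or nested elements mean.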
By enforcing the basic rules of well-formed documents, XML defines a structure that can be parsed
very easily with no knowledge of the content of a document. HTML parsers, on the other hand, can
handle ill-formed documents because knowledge of the meanings of the HTML tags is built into the parser.
Because XML documents are well formed, they can have an inherently tabular structure, which makes
them ideal for representing data tables. This chapter explores the design of a simple JDBC driver that
exploits this structure to use XML documents as the data storage element of a simple database.
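To make the tabular idea concrete, the sketch below reads a small XML document into a list of rows. It is an illustration under stated assumptions, not the driver developed in this chapter: the class XmlTableSketch, the method readRows, and the contacts document with its name and phone fields are all hypothetical. It also assumes one particular layout, in which each child element of the root is a record and each attribute is a field.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: treats an XML document as a table of rows.
class XmlTableSketch {

    // Each child element of the root becomes one row; each attribute becomes one field.
    static List<Map<String, String>> readRows(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<Map<String, String>> rows = new ArrayList<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace and other non-element nodes
                }
                Map<String, String> row = new LinkedHashMap<>();
                NamedNodeMap attributes = child.getAttributes();
                for (int j = 0; j < attributes.getLength(); j++) {
                    Node attribute = attributes.item(j);
                    row.put(attribute.getNodeName(), attribute.getNodeValue());
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read XML table", e);
        }
    }

    public static void main(String[] args) {
        // Sample data invented for this example.
        String xml =
                "<contacts>\n"
              + "  <contact name=\"Ann\" phone=\"555-0100\"/>\n"
              + "  <contact name=\"Bob\" phone=\"555-0101\"/>\n"
              + "</contacts>";
        for (Map<String, String> row : readRows(xml)) {
            System.out.println(row.get("name") + " | " + row.get("phone"));
        }
    }
}
```

Fields are looked up by name rather than by position because the DOM does not guarantee the order in which attributes are returned.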
Reasons for Accessing XML Documents with SQL
Although the primary use of XML is to provide a platform-independent way to structure data for transfer
between applications, an important secondary use of XML is for local data storage. Common examples
include the following:
XML as a replacement for properties files or INI files
XML as a replacement for comma-delimited CSV files in text databases
XML as a small, downloadable database for the delivery of stock quotes or news headlines
In some instances, an XML document, being a data repository, can be a database in itself. For example,
the contact lists on my Linux-based PDA are saved as XML documents.
Since the data in an XML file is stored in two different node types, there are two obvious ways to set up
a database using an XML file:
Store each record as an element, with the field data in attributes.