You also need to be aware of what an XML namespace is, if only because JAXP provides methods for
handling namespaces. You can find more information on JAXP at http://sun.java.com/xml/jaxp/.
Just in case you are new to XML, we will briefly explore the basic characteristics of XML and DTDs
before we start applying the classes and methods provided by JAXP to process XML documents. We
will also briefly explore what XML namespaces are for. If you are already comfortable with these topics
you can skip most of this chapter and pick up where we start talking about SAX. Let's start by looking
into the general organization of an XML document.
XML Document Structure
An XML document consists of two parts, a prolog and a document body:
The prolog provides information necessary for the interpretation of the contents of the
document body. It contains two optional components, and since you can omit both, the prolog
itself is optional. The two components of the prolog, in the sequence in which they must
appear, are:
An XML declaration that defines the version of XML that applies to the document, and may also
specify the particular Unicode character encoding used in the document and whether the
document is standalone or not. Either the character encoding or the standalone specification can be
omitted from the XML declaration but if they do appear they must be in the given sequence.
A document type declaration specifying an external Document Type Definition (DTD) that
identifies markup declarations for the elements used in the body of the document, or explicit
markup declarations, or both.
The document body contains the data. It comprises one or more elements where each element
is defined by a begin tag and an end tag. The elements in the document body define the
structure of the data. There is always a single root element that contains all the other elements.
All of the data within the document is contained within the elements in the document body.
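To make the prolog and body concrete, here is a minimal sketch that parses a small document with the JAXP DOM parser and reads back what the prolog declared. The address document, the class name PrologAndBodyDemo, and the parse() helper are all hypothetical, invented for illustration; only the JAXP and DOM calls are standard.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class PrologAndBodyDemo {
    // Hypothetical sample document. The prolog is the XML declaration
    // followed by the document type declaration (here with explicit
    // markup declarations); the body is the <address> root element.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
      + "<!DOCTYPE address [\n"
      + "  <!ELEMENT address (street, city)>\n"
      + "  <!ELEMENT street (#PCDATA)>\n"
      + "  <!ELEMENT city (#PCDATA)>\n"
      + "]>\n"
      + "<address>\n"
      + "  <street>29 South Lasalle Street</street>\n"
      + "  <city>Chicago</city>\n"
      + "</address>\n";

    // Parses XML text into a DOM Document using the default JAXP parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        // Information taken from the XML declaration in the prolog.
        System.out.println("XML version: " + doc.getXmlVersion());
        System.out.println("Encoding:    " + doc.getXmlEncoding());
        System.out.println("Standalone:  " + doc.getXmlStandalone());
        // Name given in the document type declaration.
        System.out.println("DOCTYPE:     " + doc.getDoctype().getName());
        // The single root element that contains the rest of the body.
        System.out.println("Root:        " + doc.getDocumentElement().getTagName());
    }
}
```

Note that the three items in the XML declaration appear in the required order: version, then encoding, then standalone.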
Processing instructions (PIs) for the document may also appear in the prolog, within the document
body, and after the end of the root element. Processing instructions are instructions intended for an
application that will process the document in some way. You can include comments that provide
explanations or other information for human readers of the XML document as part of the prolog and as
part of the document body.
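As an illustration of processing instructions and comments, the following sketch parses a hypothetical document and lists each PI and comment it finds, in document order. The PI targets printer-hint and reminder, the class name PiAndCommentDemo, and the collect() helper are made up for this example and have no standard meaning.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

class PiAndCommentDemo {
    // Hypothetical sample: one comment and one PI in the prolog,
    // and one comment and one PI inside the root element.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n"
      + "<!-- Comment in the prolog -->\n"
      + "<?printer-hint duplex=\"yes\"?>\n"
      + "<note>\n"
      + "  <!-- Comment in the body -->\n"
      + "  <?reminder priority=\"high\"?>\n"
      + "  <text>Call home</text>\n"
      + "</note>\n";

    // Returns a description of every PI and comment, in document order.
    static List<String> collect(String xml) {
        List<String> found = new ArrayList<String>();
        try {
            Node doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            walk(doc, found);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
        return found;
    }

    // Depth-first walk over the DOM tree, recording PIs and comments.
    private static void walk(Node node, List<String> found) {
        if (node.getNodeType() == Node.PROCESSING_INSTRUCTION_NODE) {
            ProcessingInstruction pi = (ProcessingInstruction) node;
            found.add("PI " + pi.getTarget() + ": " + pi.getData());
        } else if (node.getNodeType() == Node.COMMENT_NODE) {
            found.add("COMMENT: " + node.getNodeValue().trim());
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, found);
        }
    }

    public static void main(String[] args) {
        for (String item : collect(SAMPLE)) {
            System.out.println(item);
        }
    }
}
```

The XML declaration looks like a processing instruction but is not reported as one, so the sketch finds two PIs and two comments.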
When an XML document is said to be well-formed, it just means that it conforms to the rules for writing
XML, as defined by the XML specification. Essentially, an XML document is well-formed if its prolog
and body are consistent with the rules for creating these. In a well-formed document there must be exactly
one root element and all elements must be properly nested. We will summarize more specifically what
is required to make a document well-formed a little later in this chapter, after we have looked into the
rules for writing XML.
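One way to see well-formedness in practice is to hand a document to a parser and check whether it is rejected. The sketch below assumes the default JAXP SAX parser and treats any SAXException as "not well-formed"; the class name WellFormedCheck and the isWellFormed() helper are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    // Returns true if the parser accepts the text as well-formed XML.
    // DefaultHandler ignores all content and rethrows fatal errors,
    // so a well-formedness violation surfaces as a SAXException.
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Parser setup or I/O problem, unrelated to the document's form.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // One root element, properly nested children.
        System.out.println(isWellFormed("<a><b>text</b></a>"));
        // Improper nesting: <b> is closed after its parent <a>.
        System.out.println(isWellFormed("<a><b>text</a></b>"));
        // Two root elements.
        System.out.println(isWellFormed("<a/><b/>"));
    }
}
```

This checks only well-formedness. It does not validate the document against a DTD, which is a separate and stricter test.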