is one of a small list of basic constraints detailed in the XML specification. Any XML file
that satisfies all of these constraints is said to be well-formed and is accepted by an XML
parser. A document that is not well-formed will be rejected by an XML parser.
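As a minimal sketch of what "rejected by an XML parser" looks like in practice, the following uses the JDK's built-in SAX parser to test a string for well-formedness. The class and method names (WellFormedCheck, isWellFormed) are made up for this illustration; only the javax.xml.parsers and org.xml.sax calls come from the standard library.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any library.
class WellFormedCheck {

    // Returns true if the parser accepts the text, false if it rejects it.
    static boolean isWellFormed(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // DefaultHandler ignores all content events; its fatalError()
            // rethrows, so a well-formedness violation surfaces as an exception.
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b/></a>")); // properly nested
        System.out.println(isWellFormed("<a><b></a>"));  // <b> is never closed
        System.out.println(isWellFormed("<a/><b/>"));    // two root elements
    }
}
```

Note that this only checks syntax; it says nothing about whether the document contains the elements you expect.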
Speaking of XML parsing, quite a few XML parsers are available. A parser is simply a
program or class that reads an XML file, looks at it at least syntactically, and lets you access
some or all of the elements. Most of these parsers in the Java world conform to the Java
bindings for one of the two well-known XML APIs, SAX and DOM. SAX, the Simple API
for XML, reads the file and calls your code when it encounters certain events, such as
start-of-element, end-of-element, start-of-document, and the like. DOM, the Document Object
Model, reads the file and constructs an in-memory tree or graph corresponding to the
elements and their attributes and contents in the file. This tree can be traversed, searched,
modified (even constructed from scratch, using DOM), or written to a file.
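To make the difference between the two styles concrete, here is a small sketch that extracts the same data both ways using the parsers bundled with the JDK. The sample document, the class name SaxVsDom, and the method names are assumptions invented for this example. The SAX version reacts to start-of-element callbacks as the parser streams through the input; the DOM version builds the whole tree first and then searches it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; the XML below is made-up sample data.
class SaxVsDom {
    static final String XML =
        "<people><person name=\"Ada\"/><person name=\"Alan\"/></people>";

    // SAX: the parser calls startElement() for each opening tag it encounters.
    static List<String> namesViaSax(String xml) {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("person".equals(qName)) {
                    names.add(attrs.getValue("name"));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    // DOM: parse into an in-memory tree, then search the tree.
    static List<String> namesViaDom(String xml) {
        List<String> names = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList people = doc.getElementsByTagName("person");
            for (int i = 0; i < people.getLength(); i++) {
                names.add(((Element) people.item(i)).getAttribute("name"));
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(namesViaSax(XML));
        System.out.println(namesViaDom(XML));
    }
}
```

SAX never holds the whole document in memory, which tends to suit large inputs; DOM costs more memory but lets you revisit and modify any part of the tree after parsing.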
An alternative API called JDOM has also been released as open source. JDOM,
originally by Brett McLaughlin and Jason Hunter and now shepherded by Rolf Lear, has the
advantage of being aimed primarily at Java (DOM itself is designed to work with many
different programming languages).
But how does the parser know if an XML file contains the correct elements? Well, the
simpler, “nonvalidating” parsers don't; their only concern is the well-formedness (see the
following list) of the document. Validating parsers check that the XML file conforms to a given
Document Type Definition (DTD) or an XML Schema. DTDs are inherited from SGML;
their syntax is discussed in Verifying Structure with Schema or DTD. Schemas are newer
than DTDs and, though slightly more complex, provide more flexibility, including such
object-based features as inheritance. DTDs are written in a special syntax derived from
SGML's document type definition specification, whereas XML Schemas are expressed using
ordinary XML elements and attributes.
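The following sketch shows one way to switch on DTD validation with the JDK's DOM parser. The tiny "note" DTD, the sample documents, and the names DtdValidation and validate are all invented for illustration. The key calls are DocumentBuilderFactory.setValidating(true) and an ErrorHandler that collects validity errors, which the parser reports through error() rather than by aborting the parse.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; the DTD and documents are made-up sample data.
class DtdValidation {
    // An internal DTD subset: a note must contain a <to> followed by a <body>.
    static final String DTD =
        "<!DOCTYPE note [\n"
        + "  <!ELEMENT note (to, body)>\n"
        + "  <!ELEMENT to (#PCDATA)>\n"
        + "  <!ELEMENT body (#PCDATA)>\n"
        + "]>\n";
    static final String VALID = DTD + "<note><to>Ian</to><body>Hi</body></note>";
    // Well-formed, but the required <body> element is missing.
    static final String INVALID = DTD + "<note><to>Ian</to></note>";

    // Returns the validity problems reported by the parser (empty if valid).
    // A document that is not even well-formed triggers an IllegalStateException.
    static List<String> validate(String xml) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // request DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { problems.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println("valid document problems:   " + validate(VALID));
        System.out.println("invalid document problems: " + validate(INVALID));
    }
}
```

The second document parses without a fatal error because it is well-formed; only the validating pass notices that it breaks the DTD's content model.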
These definitions give more precise meaning to terms used with XML:
Well-formed
An XML document that conforms to the syntax of all XML documents (i.e., one root
element, correct tag/element syntax, correct nesting, etc.).
Valid
An XML document that, in addition to being well-formed, has been tested to conform to
the requirements of an XML Schema (or DTD).