as word-processing files and Java documents. XML is all these things, depending on where
you're coming from as a developer and where you want to go today—and tomorrow.
Because it is text, XML can be generated from Java in a number of ways. For very simple
cases, you can just use good old out.println(), but this is not recommended. The Java
Architecture for XML Binding (JAXB; see Converting Between Objects and XML with JAXB) and
the XML Serializers (see Converting Between Objects and XML with Serializers) provide
mechanisms for moving information in both directions between Java objects and XML
documents. Some other third-party packages provide this as well, but we'll keep the
coverage to these two.
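To make the out.println() caveat concrete, here is a minimal sketch of hand-generated XML. The class name PrintlnXml and the helpers escape() and personXml() are hypothetical names invented for this illustration; they are not part of any library. The point is that with plain string output, the program itself is responsible for escaping every special character, which is easy to forget.

```java
public class PrintlnXml {

    /** Hypothetical helper: escapes characters that are special in XML text and attribute values. */
    static String escape(String s) {
        // The ampersand must be replaced first, or the entities produced below would be re-escaped.
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Hypothetical helper: builds a tiny document fragment by string concatenation. */
    static String personXml(String name) {
        return "<person><name>" + escape(name) + "</name></person>";
    }

    public static void main(String[] args) {
        System.out.println("<?xml version=\"1.0\"?>");
        // Without escape(), the '&' and '<' in this name would make the output ill-formed.
        System.out.println(personXml("Smith & Sons <Ltd>"));
    }
}
```

Nothing here checks that tags are balanced or that names are legal, which is one reason this approach is only reasonable for the very simplest output.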
Because of the wide acceptance of XML, it is used as the basis for many other formats,
including the Open Office save file format, the SVG graphics file format, and many more.
From SGML, both HTML and XML inherit the syntax of using angle brackets (< and >) around
tags, each pair of which delimits one part of an XML document, called an element.
An element may contain content (like a <P> tag in HTML) or may not (like an <hr> in
HTML). Whereas HTML documents can begin with either an <html> tag or a <!DOCTYPE…> tag
(or, informally, with neither), an XML file may begin with an XML declaration. Indeed, it
must begin with an XML declaration, written in processing-instruction syntax (<? … ?>), if
the file's character encoding is other than UTF-8 or UTF-16:
<?xml version="1.0" encoding="iso-8859-1"?>
The question marks in <? and ?> are special characters used to identify the XML “processing
instruction” (syntactically similar to the <% and %> used in ASP and JSP).
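As a small illustration of why the encoding declaration matters, the following sketch feeds a parser raw bytes encoded as ISO-8859-1 and relies on the declaration to tell the parser how to decode them. It uses the JDK's standard DOM parser (javax.xml.parsers); the class name EncodingDeclDemo, the helper parseText(), and the sample <name> element are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class EncodingDeclDemo {

    /** Hypothetical helper: parses raw bytes and returns the root element's text content. */
    static String parseText(byte[] xmlBytes) throws Exception {
        // The parser is handed bytes, not characters, so it must discover the
        // encoding itself; it does so by reading the XML declaration.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // \u00e9 is e-acute; in ISO-8859-1 it is the single byte 0xE9,
        // which is not a valid UTF-8 sequence on its own.
        String xml = "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>"
                + "<name>Caf\u00e9</name>";
        byte[] latin1 = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(parseText(latin1));
    }
}
```

If the declaration were omitted, the parser would assume UTF-8 and would be expected to reject the lone 0xE9 byte as malformed input.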
HTML has a number of elements that accept attributes, such as those in this (very old) web
page:
<BODY bgcolor=white> ... </body>
In XML, attribute values (such as the 1.0 for the version in the processing instruction or
the white of BGCOLOR) must be quoted. In other words, quoting is optional in HTML, but
required in XML.
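The quoting rule can be checked directly with the JDK's DOM parser. This is only a sketch: the class name QuotingDemo and the helper isWellFormed() are hypothetical, and the sample tags are adapted from the web-page fragment above, with a matching uppercase end tag so that quoting is the only difference between the two inputs.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class QuotingDemo {

    /** Hypothetical helper: returns true if the parser accepts the text as well-formed XML. */
    static boolean isWellFormed(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors rather than printing them,
        // so a failure surfaces only as the exception caught below.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Unquoted attribute value: tolerated by HTML, rejected by an XML parser.
        System.out.println(isWellFormed("<BODY bgcolor=white></BODY>"));
        // Quoted attribute value: accepted.
        System.out.println(isWellFormed("<BODY bgcolor=\"white\"></BODY>"));
    }
}
```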
The BODY example shown here, though allowed in traditional HTML, would draw complaints
from any XML parser. XML is case sensitive; in XML, BODY, Body, and body represent three
different element names. In addition, each XML start tag must have a matching end tag. This