Java Reference
superficial appearance is where the similarity between XML and HTML ends. XML and HTML are
profoundly different in purpose and capability.
The Purpose of XML
Although an XML document can be created, read, and understood by a person, XML is primarily for
communicating data from one computer to another. XML documents are therefore more typically generated
and processed by computer programs. An XML document defines the structure of the data it contains so a
program that receives it can properly interpret it. Thus XML is a tool for transferring information and its
structure between computer programs. HTML, on the other hand, is solely for describing how data should look
when it is displayed or printed. The structuring information that appears in an HTML document relates to
the presentation of the data as a visible image. The purpose of HTML is data presentation.
HTML provides you with a set of tags that is essentially fixed and geared to the presentation of data.
XML is a language in which you can define new sets of tags and attributes to suit different kinds of data
— indeed, to suit any kind of data including your particular data. Because XML is extensible, it is often
described as a meta-language — a language for defining new languages, in other words. The first step in
using XML to exchange data is to define the language that you intend to use for that purpose in XML.
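To make the idea of inventing your own tags concrete, here is a minimal sketch. The address markup, its tag names, and the class name CustomMarkupDemo are all hypothetical, made up purely for illustration. The program uses the standard DOM parser that ships with the JDK to read the document and pull out one item of data by its tag name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class CustomMarkupDemo {
    // Hypothetical markup invented for this sketch: an <address> element
    // whose child elements name each piece of data they contain.
    static final String ADDRESS_XML =
        "<?xml version=\"1.0\"?>"
      + "<address>"
      + "<street>1 Example Street</street>"
      + "<city>Sampleville</city>"
      + "<zip>00000</zip>"
      + "</address>";

    // Parse the XML text and return the text content of the first
    // element with the given tag name inside the root element.
    static String childText(String xml, String tagName) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        return root.getElementsByTagName(tagName).item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The receiving program finds the data by its structure, not its layout.
        System.out.println("Root element: address, city = "
            + childText(ADDRESS_XML, "city"));
    }
}
```

Nothing in the parser knows about addresses. The tags carry the structure, and the receiving program only needs to know which tag names to look for.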
Of course, if I invent a set of XML markup to describe data of a particular kind, you need to know the
rules for creating XML documents of this type if you want to create, receive, or modify them. As you will see
later, the definition of the markup that has been used within an XML document can be included as part of the
document. It can also be provided as a separate entity, in a file identified by a URI, for example, that can
be referenced within any document of that type. The use of XML has already been standardized for very
diverse types of data. XML languages exist for describing the structures of chemical compounds and musical
scores, as well as plain old text such as in this topic.
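As a rough sketch of a markup definition carried inside the document itself, the following program embeds a small DTD in a DOCTYPE declaration and asks the JDK's parser to validate against it. The note markup, its rules, and the class name InternalDtdDemo are hypothetical and chosen only to keep the example short. The error handler is written to treat any validity error as a failure, which is one possible policy rather than the only one.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class InternalDtdDemo {
    // Hypothetical rules: a <note> holds exactly one <to> followed by one <body>.
    static final String DTD =
        "<!DOCTYPE note ["
      + "<!ELEMENT note (to, body)>"
      + "<!ELEMENT to (#PCDATA)>"
      + "<!ELEMENT body (#PCDATA)>"
      + "]>";

    static final String VALID = "<?xml version=\"1.0\"?>" + DTD
        + "<note><to>Reader</to><body>Hello</body></note>";

    // The <to> element is missing, so this document breaks the rules above.
    static final String INVALID = "<?xml version=\"1.0\"?>" + DTD
        + "<note><body>Hello</body></note>";

    // Returns true if the document parses and obeys its own DTD.
    static boolean isValid(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // check the document against its DTD
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // by default validity errors are ignored; rethrow them
            }
        });
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("VALID obeys the DTD: " + isValid(VALID));
        System.out.println("INVALID obeys the DTD: " + isValid(INVALID));
    }
}
```

The same rules could instead live in a separate file referenced from the DOCTYPE declaration, so that many documents of the same type can share one definition.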
Processing XML in Java
The Java API for XML Processing (JAXP) provides you with the means for reading, creating, and modifying
XML documents from within your Java programs. To understand and use this application program interface
(API) you need to be reasonably familiar with two basic topics:
• What an XML document is for and what it consists of
• What a DTD is and how it relates to an XML document
You also need to be aware of what an XML namespace is, if only because JAXP has methods for
handling namespaces. You can find more information on JAXP at http://jaxp.java.net.
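As a small taste of the namespace-related methods, here is a sketch that turns on namespace awareness in a DocumentBuilderFactory and then asks the root element for its namespace URI and local name. The geo prefix, the namespace URI, and the class name NamespaceDemo are hypothetical values used only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class NamespaceDemo {
    // Hypothetical prefix and namespace URI, made up for this sketch.
    static final String CIRCLE_XML =
        "<?xml version=\"1.0\"?>"
      + "<geo:circle xmlns:geo=\"http://www.example.com/geometry\" radius=\"15\"/>";

    // Parse the text with a namespace-aware parser and return the root element.
    static Element parseRoot(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Factories are not namespace-aware by default, so switch it on.
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)))
                      .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element root = parseRoot(CIRCLE_XML);
        System.out.println("Qualified name: " + root.getTagName());
        System.out.println("Local name:     " + root.getLocalName());
        System.out.println("Namespace URI:  " + root.getNamespaceURI());
    }
}
```

With namespace awareness enabled, the parser separates the prefix from the element name and reports which namespace the element belongs to, which is what lets two markup vocabularies use the same tag name without clashing.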
In case you are new to XML, I introduce the basic characteristics of XML and DTDs before explaining
how you apply some of the classes and methods provided by JAXP to process XML documents. I also
briefly explore what XML namespaces are for. If you are already comfortable with these topics you can skip
most of this chapter and pick up where I start talking about SAX. Let's start by looking into the general
organization of an XML document.
XML DOCUMENT STRUCTURE
An XML document basically consists of two parts, a prolog and a document body: