Java Reference
In-Depth Information
THE BASICS
XML stands for Extensible Markup Language. It is a language specification for
documents that describe and contain data. XML's designers attempted to combine
the simplicity and ubiquity of HTML with the rich descriptive capabilities of
Standard Generalized Markup Language (SGML). Both are rooted in SGML: HTML was
defined as an SGML application (a document type), while XML is a simplified
subset of SGML itself.
An XML document is a text file that conforms to the XML language specification.
It contains data in a structured format, along with descriptive information about
that data. The primary role of an XML document is to present data generated by one
application (or system) to another. Consequently, XML documents are well suited to
serve as general-purpose data repositories and data transport containers, and for
common structures such as configuration files.
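As a rough illustration of one application producing XML for another to consume, the sketch below builds a tiny configuration document with the JDK's standard DOM and transformer classes. The class name `ConfigWriter` and the element names `config` and `setting` are assumptions made up for this example; they do not come from any particular application.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class ConfigWriter {
    // Builds a one-setting configuration document and returns it as XML text.
    // The <config>/<setting> vocabulary is hypothetical, chosen for illustration.
    static String buildConfigXml(String name, String value) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();

        Element root = doc.createElement("config");
        doc.appendChild(root);

        Element setting = doc.createElement("setting");
        setting.setAttribute("name", name);   // descriptive information about the data
        setting.setTextContent(value);        // the data itself
        root.appendChild(setting);

        // Serialize the in-memory tree to text, omitting the XML declaration.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildConfigXml("timeout", "30"));
    }
}
```

The resulting text can be written to a file or sent over a network, and any XML-aware program can read it back without knowing how it was produced.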
XML VS. HTML
XML provides more control than HTML, primarily by allowing a document to define
its own tags (much as a program defines a data type or, more to the point, a record
type). This capability lets a document organize its data in a structured format.
An XML document can also contain enough metadata (information about the data) that
any application can reliably parse the document and extract the data from it.
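To make the idea of self-describing tags concrete, here is a minimal sketch that parses a small XML document with the JDK's built-in DOM parser and pulls out fields by tag name. The `cart`, `book`, `title`, `author`, and `price` elements, the `format` attribute, and the `CartReader` class are all hypothetical names invented for this illustration, not a format used by any real retailer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class CartReader {
    // Hypothetical shopping-cart document; the tag names are illustrative only.
    static final String CART_XML =
        "<cart>"
        + "<book format=\"Paperback\">"
        + "<title>Debt of Honor</title>"
        + "<author>Tom Clancy</author>"
        + "<price>6.99</price>"
        + "</book>"
        + "</cart>";

    // Returns the text of the first descendant element with the given tag name.
    static String textOf(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent();
    }

    // Parses the XML text and summarizes the first <book> element it contains.
    static String describeFirstBook(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Element book = (Element) doc.getElementsByTagName("book").item(0);
        return textOf(book, "title") + " by " + textOf(book, "author")
            + " (" + book.getAttribute("format") + "), " + textOf(book, "price");
    }

    public static void main(String[] args) throws Exception {
        // prints "Debt of Honor by Tom Clancy (Paperback), 6.99"
        System.out.println(describeFirstBook(CART_XML));
    }
}
```

Because each value sits inside a tag that names what it is, the program asks for `title` or `price` directly instead of guessing from layout or position.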
In contrast, HTML is designed to describe documents in a format suitable for
end-user viewing in a graphical browser. HTML documents do not contain information
about the meaning of their data, nor are they structured in a way that is easy for
a program to analyze. As a result, an application may have a difficult time
extracting relevant data from an HTML document.
A relatively simple example makes this point. Here is a portion of an HTML page
that might be generated by an Internet book retailer. It tells a browser how to
render the current contents of the shopping cart page for a potential purchaser.
<td bgcolor="#FFFFFF" width="51%">
<a href="../81332713233407">
<em>Debt of Honor</em></a>
<br>
Tom Clancy;
Paperback</b>
<font size=2 face="Verdana, Helvetica, Courier" color=#000000>
<NOBR>Price: <font color=#990>$6.99</font></b></NOBR><br>
 