Java Reference
Designing an XML Dialect
Although XML is described as a language and is compared with Hypertext Markup
Language (HTML), it's actually much larger in scope than that. XML is a markup
language that defines how to define a markup language.
That's an odd distinction to make, and it sounds like the kind of thing you'd
encounter in a philosophy textbook. This concept is important to understand,
though, because it explains how XML can be used to define data as varied as
health-care claims, genealogical records, newspaper articles, and molecules.
The “X” in XML stands for Extensible, and it refers to organizing data for your
own purposes. Data that's organized using the rules of XML can represent
anything you want:
• A programmer at a telemarketing company can use XML to store data on each
  outgoing call, saving the time of the call, the number, the operator who made
  the call, and the result.
• A lobbyist can use XML to keep track of the annoying telemarketing calls she
  receives, noting the time of the call, the company, and the product being
  peddled.
• A programmer at a government agency can use XML to track complaints about
  telemarketers, saving the name of the marketing firm and the number of
  complaints.
Each of these examples uses XML to define a new language that suits a specific purpose.
Although you could call them XML languages, they're more commonly described as
XML dialects or XML document types.
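As a rough illustration of the first example, a record for one outgoing call might be marked up as shown in the sketch below. The element names (call, time, number, operator, result) and the sample values are invented here for illustration and do not come from any existing dialect. The code uses the standard JAXP DOM parser to confirm that the data is well-formed and to read one value back.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class CallLogSketch {
    // Hypothetical dialect: every element name and value here is made up.
    static final String CALL_XML =
        "<?xml version=\"1.0\"?>\n"
        + "<call>\n"
        + "  <time>18:30</time>\n"
        + "  <number>555-0100</number>\n"
        + "  <operator>Pat</operator>\n"
        + "  <result>no answer</result>\n"
        + "</call>";

    // Parses the XML text and returns the text of the first element with
    // the given name, or null if no such element exists.
    static String readField(String xml, String elementName) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Node node = doc.getElementsByTagName(elementName).item(0);
            return node == null ? null : node.getTextContent();
        } catch (SAXException e) {
            // The parser throws SAXException when the text is not well-formed.
            throw new IllegalArgumentException("XML is not well-formed", e);
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Operator: " + readField(CALL_XML, "operator"));
        System.out.println("Result: " + readField(CALL_XML, "result"));
    }
}
```

The lobbyist and the government agency would each pick different element names for their own records, which is what makes each one a separate dialect.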
An XML dialect can be designed using a Document Type Definition (DTD) that
indicates the potential elements and attributes that it covers.
A special !DOCTYPE declaration can be placed in XML data, right after the initial
<?xml declaration, to identify its DTD. Here's an example:
<!DOCTYPE Library SYSTEM "librml.dtd">
The !DOCTYPE declaration is used to identify the DTD that applies to the data. When a
DTD is present, many XML tools can read XML created for that DTD and determine
whether the data follows all the rules correctly. If it doesn't, the data is rejected
with a reference to the line that caused the error. This process is called validating
the XML.
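A minimal sketch of validation in Java, using the standard JAXP parser with validation switched on, might look like the following. To stay self-contained it embeds a small DTD inside the !DOCTYPE declaration rather than pointing at an external file such as librml.dtd. The Book and Magazine elements and the content model are assumptions made up for this example, not the actual rules of that DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class DtdValidationSketch {
    // Hypothetical DTD, embedded in the DOCTYPE so no external file is needed.
    static final String DOCTYPE =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE Library [\n"
        + "  <!ELEMENT Library (Book*)>\n"
        + "  <!ELEMENT Book (#PCDATA)>\n"
        + "]>\n";

    static final String VALID = DOCTYPE + "<Library><Book>Walden</Book></Library>";
    // Magazine is not declared in the DTD, so this document is invalid.
    static final String INVALID = DOCTYPE + "<Library><Magazine>Time</Magazine></Library>";

    // Returns one message per problem found; an empty list means the data is valid.
    static List<String> validate(String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // check the data against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    // Warnings are ignored in this sketch.
                }

                @Override
                public void error(SAXParseException e) {
                    // Validity errors arrive here, with the offending line number.
                    problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
                }

                @Override
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // not well-formed, so parsing cannot continue
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            problems.add("fatal: " + e.getMessage());
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println("Valid document problems: " + validate(VALID));
        System.out.println("Invalid document problems: " + validate(INVALID));
    }
}
```

Collecting the messages in a list, instead of stopping at the first one, lets a caller report every rule the data breaks in a single pass.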
One thing you'll run into as you work with XML is data that has been structured as
XML but wasn't defined using a DTD. Most versions of RSS files do not require a DTD.
This data can be parsed (presuming it's well-formed), so you can read it into a program
 