10 The problem of XML data exchange
In this chapter we shall study data exchange for XML documents. XML itself was invented
as a standard for data exchange on the Web, albeit under a different interpretation of the
term “data exchange”. In the Web context, it typically refers to a common, flexible format
that everyone agrees on, and that, therefore, facilitates the transfer of data between different
sites and applications. When we speak of data exchange, we mean transforming databases
under different schemas with respect to schema mapping rules, and querying the exchanged
data.
10.1 XML documents and schemas
In this section we review the basic definitions regarding XML. Note that a simple example
was already shown in Chapter 1. XML documents have a hierarchical structure, usually
abstracted as a tree. An example is shown in Figure 10.1. This document contains
information about rulers of European countries. Its structure is represented by a labeled tree; in
this example, the labels are europe, country, and ruler. In the XML context, these are
referred to as element types. We assume that the labels come from a finite labeling alphabet
and correspond, roughly, to relation names from the classical relational setting.
The root of the tree is labeled europe, and it has two children that are labeled country.
These have data values, given in parentheses: the first one is Scotland, and the second
one is England. Each country in turn has a set of rulers. That is, the children of each
country node are labeled ruler, and have associated data values assigned to them, for
example, James V. These data values come from a potentially infinite set (e.g., of strings,
or numbers). We also assume that, in general, children of each node are ordered; normally
this order is interpreted as going from left to right in the picture. That is, James V is the
first child of the Scotland node, and Charles I is the last. In our example, this corresponds
to the chronological order.
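The tree abstraction described above can be sketched in a few lines of Python. This is only an illustrative model, not a definition used later in the chapter: the class name Node, its field names, and the helper labels are assumptions made for this sketch. Only the nodes named explicitly in the text are included; the remaining ruler nodes of Figure 10.1 are left out rather than guessed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """A node of an ordered, labeled tree (names here are illustrative)."""
    label: str                     # element type, drawn from a finite alphabet
    value: Optional[str] = None    # data value, drawn from an infinite domain
    children: List["Node"] = field(default_factory=list)  # list order = sibling order


# Only the nodes mentioned in the text; other rulers from Figure 10.1 are omitted.
scotland = Node("country", "Scotland", [
    Node("ruler", "James V"),      # first child of the Scotland node
    # ... rulers between the first and the last are not reproduced here
    Node("ruler", "Charles I"),    # last child of the Scotland node
])
england = Node("country", "England")  # its ruler children are not reproduced here
doc = Node("europe", children=[scotland, england])


def labels(node: Node) -> Set[str]:
    """Collect the set of element types occurring in the subtree of node."""
    found = {node.label}
    for child in node.children:
        found |= labels(child)
    return found
```

Storing the children in a list rather than a set is what captures the left-to-right sibling order; the labels play the role of relation names, while the values range over an unbounded domain.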
In general, a node may have more than one data value. We assume, under the analogy
between node labels and relation names, that each node has some attributes that store data
values associated with it.
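To make the analogy with relation names concrete, the following sketch attaches a fixed list of attributes to each label, much as a relation name has a fixed list of columns. The attribute names (name, reign_start), the table ATTRIBUTES, and the function conforms are hypothetical and serve only as illustration; the year 1513 is an illustrative value that does not come from the figure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical attribute lists per label, playing the role of relation schemas.
ATTRIBUTES: Dict[str, Tuple[str, ...]] = {
    "europe": (),
    "country": ("name",),
    "ruler": ("name", "reign_start"),
}


@dataclass
class Node:
    """A tree node whose data values are stored in named attributes."""
    label: str
    attrs: Dict[str, object] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def conforms(node: Node) -> bool:
    """Check that every node carries exactly the attributes declared for its label."""
    if node.label not in ATTRIBUTES:
        return False
    if set(node.attrs) != set(ATTRIBUTES[node.label]):
        return False
    return all(conforms(child) for child in node.children)


# A fragment of the example document with two data values on the ruler node.
doc = Node("europe", children=[
    Node("country", {"name": "Scotland"}, [
        Node("ruler", {"name": "James V", "reign_start": 1513}),
    ]),
])
```

Under these assumptions, conforms(doc) holds, while a ruler node that lacks one of its declared attributes is rejected, just as a tuple of the wrong arity would not fit a relation.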