The problem of XML data exchange - Foundations of Data Exchange


10 The problem of XML data exchange

In this chapter we shall study data exchange for XML documents. XML itself was invented

as a standard for data exchange on the Web, albeit under a different interpretation of the

term “data exchange”. In the Web context, it typically refers to a common, flexible format

that everyone agrees on, and that, therefore, facilitates the transfer of data between different

sites and applications. When we speak of data exchange, we mean transforming databases

under different schemas with respect to schema mapping rules, and querying the exchanged

data.

10.1 XML documents and schemas

In this section we review the basic definitions regarding XML. Note that a simple example

was already shown in Chapter 1. XML documents have a hierarchical structure, usually

abstracted as a tree. An example is shown in Figure 10.1. This document contains

information about rulers of European countries. Its structure is represented by a labeled tree; in

this example, the labels are europe, country, and ruler. In the XML context, these are

referred to as element types. We assume that the labels come from a finite labeling alphabet

and correspond, roughly, to relation names from the classical relational setting.

The root of the tree is labeled europe, and it has two children that are labeled country.

These have data values, given in parentheses: the first one is Scotland, and the second

one is England. Each country in turn has a set of rulers. That is, the children of each

country node are labeled ruler, and have associated data values assigned to them, for

example, James V. These data values come from a potentially infinite set (e.g., of strings,

or numbers). We also assume that, in general, children of each node are ordered; normally

this order is interpreted as going from left to right in the picture. That is, James V is the

first child of the Scotland node, and Charles I is the last. In our example, this corresponds

to the chronological order.
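
The tree model just described can be made concrete with a small sketch. The class name Node and the helper labels below are illustrative assumptions, not notation from the text. Only the nodes named in the prose are included, so the remaining rulers of Figure 10.1 are left out rather than guessed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """A node of an ordered, labeled tree carrying an optional data value."""
    label: str                   # element type, e.g. "country"
    value: Optional[str] = None  # data value, e.g. "Scotland"
    # Children are kept in a list, so their left-to-right order is preserved.
    children: List["Node"] = field(default_factory=list)


# Only nodes mentioned in the text; other rulers from the figure are omitted.
scotland = Node("country", "Scotland", [
    Node("ruler", "James V"),    # first child of the Scotland node
    Node("ruler", "Charles I"),  # last child of the Scotland node
])
england = Node("country", "England")
doc = Node("europe", children=[scotland, england])


def labels(node: Node) -> Set[str]:
    """Collect the element types used in the subtree rooted at node."""
    result = {node.label}
    for child in node.children:
        result |= labels(child)
    return result
```

Here labels(doc) gathers the finite set of element types used by the document, while the data values are arbitrary strings drawn from an unbounded domain.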

In general, a node may have more than one data value. We assume, under the analogy

between node labels and relation names, that each node has some attributes that store data

values associated with it.
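
One way to picture nodes with several data values is to let each node carry a mapping from attribute names to values, much as a tuple of a relation assigns a value to each column. The sketch below is a minimal illustration under that assumption. The class AttrNode, the function as_tuple, and the attribute names "name" and "capital" are hypothetical and do not come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass
class AttrNode:
    """A tree node whose data values are stored in named attributes."""
    label: str                                            # plays the role of a relation name
    attrs: Dict[str, str] = field(default_factory=dict)   # attribute name -> data value
    children: List["AttrNode"] = field(default_factory=list)


# Hypothetical attributes: the text does not fix any attribute names.
ruler = AttrNode("ruler", {"name": "James V"})
country = AttrNode("country", {"name": "Scotland", "capital": "Edinburgh"}, [ruler])


def as_tuple(node: AttrNode, attr_order: Sequence[str]) -> Tuple[str, Tuple[Optional[str], ...]]:
    """View a node as (relation name, tuple of attribute values in a fixed order)."""
    return (node.label, tuple(node.attrs.get(a) for a in attr_order))
```

For example, as_tuple(country, ["name", "capital"]) reads the country node as a two-column tuple of a relation named country, which is the analogy between labels and relation names that the text relies on.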
