equivalences [Fagin et al. 2008]. These optimizations are very important in applications
in which mappings are required to be minimal for efficiency reasons. We discuss the
recent approaches [Gottlob et al. 2009; Fagin et al. 2008] in Sect. 6.4.
6.1 Bridging Data and Metadata
HePToX [Bonifati et al. 2010, 2005] was the first system to introduce data-metadata
correspondences, which drive the transformation from schema components in the source
schema to instance values in the target schema and vice versa. Such novel
correspondences enrich the semantics of the transformation, while at the same time
posing new research challenges. HePToX uses a Datalog-based mapping language called
TreeLog; being an extension of SchemaLog, it can handle schema and data on a par.
TreeLog expressions are inferred from arrows and boxes, drawn in an ad hoc graphical
notation, between elements in the source schema and instances in the target schema.
By virtue of a bidirectional semantics for query answering, correspondences that
involve data-metadata conflicts can also be traversed, collecting the components
needed to answer the queries. Queries are expressed in XQuery and the underlying data
is expressed in XML to maintain the connection with TreeLog expressions, which are
intrinsically nested.
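As a rough illustration of what a data-metadata correspondence does, the sketch below turns column names (schema components) of a source relation into values of a target relation, and back again. It is plain Python, not TreeLog or HePToX code; the function names, the field names (`day`, `company`, `price`) and the sample rows are all hypothetical and chosen only for the example.

```python
def schema_to_data(rows, key, name_field, value_field):
    """Turn every non-key column name into a value of `name_field`."""
    out = []
    for row in rows:
        for column, value in row.items():
            if column == key:
                continue
            out.append({key: row[key], name_field: column, value_field: value})
    return out


def data_to_schema(rows, key, name_field, value_field):
    """Inverse direction: values of `name_field` become column names."""
    by_key = {}
    for row in rows:
        wide = by_key.setdefault(row[key], {key: row[key]})
        wide[row[name_field]] = row[value_field]
    return list(by_key.values())


# Hypothetical source: company names are part of the schema (column names).
source = [
    {"day": "d1", "acme": 10, "globex": 20},
    {"day": "d2", "acme": 11, "globex": 19},
]

# Target: company names are now instance values.
target = schema_to_data(source, "day", "company", "price")
round_trip = data_to_schema(target, "day", "company", "price")
```

Here the two functions play the role of the two directions of a correspondence. A system such as HePToX would derive rules with this effect from the graphical specification rather than from hand-written code.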
Recently, MAD (MetadatA-Data) mappings [Hernandez et al. 2008] have been studied as
useful extensions in Clio [Popa et al. 2002]; they extend the basic mappings expressed
as s-t tgds. Contrary to HePToX, such mappings are used for data exchange. To this
purpose, dynamic output schemas are defined, since the result of data exchange cannot
be determined a priori whenever it depends on the instances. MAD mappings in Clio are
also generated from visual specifications, similarly to HePToX, and then translated
into executable transformations. The translation algorithm has two steps: the first
“shreds” the source data into views that offer a relational partitioning of the target
schema, and the second restructures the result of the first by also taking into
account user-defined grouping in target schemas with nested sets.
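The two-step shape of this translation can be sketched in a few lines of Python. This is only an illustration of "shred, then restructure with grouping" under simplifying assumptions, not Clio's actual algorithm; the functions `shred` and `restructure`, the field names and the sample data are hypothetical.

```python
def shred(source_rows):
    """Step 1 (sketch): split the source into flat views, one per target set."""
    regions = sorted({row["region"] for row in source_rows})      # parent view
    cities = [(row["region"], row["city"]) for row in source_rows]  # child view
    return regions, cities


def restructure(regions, cities):
    """Step 2 (sketch): nest the child view under the parent by grouping key."""
    nested = {region: [] for region in regions}
    for region, city in cities:
        nested[region].append(city)
    return [
        {"region": region, "cities": sorted(members)}
        for region, members in nested.items()
    ]


# Hypothetical flat source instance.
source = [
    {"region": "north", "city": "a"},
    {"region": "south", "city": "b"},
    {"region": "north", "city": "c"},
]

result = restructure(*shred(source))
```

The grouping key (`region` here) stands in for the user-defined grouping mentioned above: choosing a different key would yield a differently nested target instance from the same flat views.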
To summarize, Clio derives a set of MAD mappings from a set of lines between a source
schema and a target schema. Applying these transformations computes a target instance
that adheres to the target schema and to the correspondences. Similarly, HePToX
derives a set of TreeLog mapping rules from element correspondences (i.e., boxes and
arrows) between two schemas. TreeLog rules are similar in spirit to s-t tgds, although
TreeLog has a second-order syntax. However, the problems solved by Clio and HePToX are
different: in Clio the goal is data exchange, while in HePToX it is query
reformulation in a highly distributed setting, as we will further discuss in Sect. 6.3.