data. Their output looks like an ETL flowchart. ETL systems require little built-in
intelligence, since the input provided by the designer is so detailed that only
a limited form of reasoning is necessary. Similar to ETL systems are mashup editors
[ Heinzl et al. 2009 ], which try to assist the mashup designer. The operational
goals of mashup editors are similar to those of ETL systems, so we do not consider
them as a separate category.
We use the term matching or mapping scenario to refer to a particular instance
of the matching or mapping problem, respectively. A scenario is represented by
the input provided to the matching or mapping tool. More specifically, a matching
scenario is a pair of source and target schemas. A mapping scenario is a pair of source
and target schemas alongside a specification of the intended mappings. A solution to
a scenario is a set of matches or mappings, respectively, that satisfy the specifications
set by the scenario.
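The definitions above can be sketched as simple data structures. This is a minimal illustration only: the names `Schema`, `MatchingScenario`, `MappingScenario`, and `is_well_formed` are hypothetical, schemas are reduced to flat sets of element names, and the mapping specification is kept as an opaque string, none of which is prescribed by the text.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple

# Hypothetical representation: a match pairs one source element with one target element.
Match = Tuple[str, str]


@dataclass(frozen=True)
class Schema:
    """A schema reduced to a name and a flat set of element names (an assumption)."""
    name: str
    elements: FrozenSet[str]


@dataclass(frozen=True)
class MatchingScenario:
    """A matching scenario: a pair of source and target schemas."""
    source: Schema
    target: Schema


@dataclass(frozen=True)
class MappingScenario:
    """A mapping scenario: the schema pair plus a specification of the intended mappings."""
    source: Schema
    target: Schema
    intended_mappings: str  # kept opaque here; the specification format is an assumption


def is_well_formed(scenario: MatchingScenario, matches: Iterable[Match]) -> bool:
    """Check that every match links a source element to a target element.

    This is only a sanity check on a candidate solution; it says nothing
    about whether the matches are the ones the designer intended.
    """
    return all(
        s in scenario.source.elements and t in scenario.target.elements
        for s, t in matches
    )
```

A candidate solution to a matching scenario is then just a collection of `Match` pairs, which `is_well_formed` can screen before any quality assessment takes place.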
3 Challenges in Matching and Mapping System Evaluation
A fundamental requirement for a universal evaluation of matching and
mapping tools is the existence of benchmarks. A benchmark for a computer application
or tool is based on the idea of evaluation scenarios, i.e., a standardized
set of problems or tests serving as a basis for comparison.¹ An evaluation scenario
for a matching/mapping tool is a scenario alongside the expected output
of the tool, i.e., the expected solution. Unfortunately, compared with benchmarks for
relational database management tools, such as TPC-H [ Transaction Processing
Performance Council 2001 ], or for XML query engines, such as XMach [ Bohme
and Rahm 2001 ], XOO7 [ Bressan et al. 2001 ], MBench [ Runapongsa et al. 2002 ],
XMark [ Schmidt et al. 2002 ], and XBench [ Yao et al. 2004 ], the design of a benchmark
for matching/mapping tools is fundamentally different and significantly more
challenging [ Okawara et al. 2006 ], mainly due to the different nature, goals, and
operational principles of the tools.
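To make the notion of an evaluation scenario concrete, the following sketch pairs a scenario with its expected solution and compares a tool's output against it. The names `EvaluationScenario` and `compare` are hypothetical, and comparing by exact set membership is a simplifying assumption made here for illustration, not a method taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Tuple

# Hypothetical representation: a match pairs one source element with one target element.
Match = Tuple[str, str]


@dataclass(frozen=True)
class EvaluationScenario:
    """A scenario together with the expected output of the tool."""
    source_elements: FrozenSet[str]
    target_elements: FrozenSet[str]
    expected: FrozenSet[Match]  # the expected solution


def compare(scenario: EvaluationScenario,
            produced: Iterable[Match]) -> Dict[str, FrozenSet[Match]]:
    """Split a tool's output into found, missing, and unexpected matches."""
    produced_set = frozenset(produced)
    return {
        "found": produced_set & scenario.expected,
        "missing": scenario.expected - produced_set,
        "unexpected": produced_set - scenario.expected,
    }


if __name__ == "__main__":
    scenario = EvaluationScenario(
        source_elements=frozenset({"emp_name", "emp_id"}),
        target_elements=frozenset({"name", "id"}),
        expected=frozenset({("emp_name", "name"), ("emp_id", "id")}),
    )
    report = compare(scenario, [("emp_name", "name"), ("emp_id", "name")])
    for label, matches in report.items():
        print(label, sorted(matches))
```

A benchmark in this sense would be a standardized collection of such scenarios, so that different tools can be run on the same inputs and their reports compared.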
One of the differences is the fact that given a source and a target schema, there is
not always one “correct” set of matches or mappings. In query engines [ Transaction
Processing Performance Council 2001 ; Bohme and Rahm 2001 ], the correct answer
to a given query is uniquely specified by the semantics of the query language. In
matching/mapping tools, on the other hand, the expected answer depends not only
on the semantics of the schemas, which by nature may be ambiguous, but also on the
transformation that the mapping designer was intending to make. The situation is
reminiscent of Web search engines, where many documents are returned
as an answer to a given keyword query, some more and some less related to the
query, but which document is actually the correct answer can only be decided by the
1 Source: Merriam-Webster dictionary.