Comparing examples 3.7 and 3.8 raises interesting observations. First, the modeling of second-
line matchers can serve as a reference framework for comparing various research efforts in schema
matching. For example, while combiners and match selectors are defined to be separate types by
Lee et al. [2007], they were combined and redefined by Gal [2006]. A second observation involves
the goal of second-line matchers. Second-line matchers aim at improving the outcomes of first-line
schema matchers, increasing their robustness. This idea is appealing since complementary matchers
can potentially compensate for each other's weaknesses [Bernstein et al., 2004]. Gal [2006]
showed that a heuristic based on the top-K best schema matchings increased the precision
of mappings by 25% on average, at the cost of a minor 8% reduction in recall.
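To make the precision/recall trade-off concrete, the following minimal sketch computes both measures for a proposed matching against a reference matching. The attribute names and the two candidate matchings are hypothetical toy data, not results from Gal [2006], and the function name `precision_recall` is likewise an assumption made for illustration.

```python
def precision_recall(proposed, reference):
    """Return (precision, recall) of a proposed set of attribute pairs."""
    proposed, reference = set(proposed), set(reference)
    correct = proposed & reference
    precision = len(correct) / len(proposed) if proposed else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical reference matching: attribute pairs assumed to correspond.
reference = {("name", "fullName"), ("addr", "address"),
             ("zip", "postalCode"), ("tel", "phone")}

# Hypothetical first-line output: every correct pair plus two spurious ones.
first_line = reference | {("name", "address"), ("zip", "phone")}

# Hypothetical second-line output: the spurious pairs are gone,
# but one correct pair was dropped as well.
second_line = reference - {("tel", "phone")}

p1, r1 = precision_recall(first_line, reference)   # 4 of 6 correct, 4 of 4 found
p2, r2 = precision_recall(second_line, reference)  # 3 of 3 correct, 3 of 4 found
print(f"first-line:  precision={p1:.2f} recall={r1:.2f}")
print(f"second-line: precision={p2:.2f} recall={r2:.2f}")
```

In this toy case the second-line step raises precision while lowering recall, which is the kind of trade-off the reported figures describe.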
Table 3.4: Two-dimension matcher classification

Matcher           First-Line Matcher    Second-Line Matcher
Non-decisive      Term                  Combined
Decision maker                          MWBG

We now propose yet another classification of matchers on two orthogonal dimensions (see
Table 3.4 for classification and example matchers). The first dimension separates first- from second-
line schema matchers. The second dimension separates those matchers that aim at specifying schema
matchings, dubbed decision makers, from those that compute similarity values yet do not make
decisions at the schema level. Using Definition 3.5 , we can say that a matcher is decisive if it satisfies
. The most common type is a non-decisive first-line matcher. The OntoBuilder's Term matcher
belongs to this class, as does a WordNet-based decision tree technique proposed by Embley et al.
[2002]. Combiners, in COMA's terminology, are non-decisive second-line schema matchers. They
combine similarity matrices of other matchers, and hence they are second-line matchers by definition.
However, their similarity matrix is not meant to be used to decide on a single schema matching.
Well-known decisive second-line matchers are algorithms like MWBG and SM. Both algorithms
fall into the category of constraint enforcers as described by Lee et al. [2007], and both enforce a
cardinality constraint of 1:1. Finally, the class of first-line decision makers contains few if any
matchers. The main reason for this is that most systems abide by the long conceptual modeling
tradition of database schema integration, as summarized by Batini et al. [1986]: “The comparison
activity focuses on primitive objects first...; then it deals with those modeling constructs that represent
associations among primitive objects.” This dichotomy has in the main been preserved in schema
matching as well.
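The difference between non-decisive and decisive second-line matchers can be sketched in a few lines. The sketch below assumes that a combiner takes a weighted average of similarity matrices and that MWBG stands for maximum-weight bipartite graph matching. Both functions, their names, and the similarity values are hypothetical illustrations rather than the implementations used by COMA or OntoBuilder, and the brute-force search is practical only for very small schemata.

```python
from itertools import permutations


def combine(matrices, weights=None):
    """Non-decisive second-line matcher (assumed form): weighted average
    of the similarity matrices produced by other matchers."""
    n = len(matrices)
    weights = weights or [1.0 / n] * n
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(w * m[i][j] for w, m in zip(weights, matrices))
             for j in range(cols)]
            for i in range(rows)]


def mwbg(sim):
    """Decisive second-line matcher (assumed form): choose the 1:1 matching
    with the largest total similarity. Brute force over all assignments;
    requires that the matrix has no more rows than columns."""
    rows, cols = len(sim), len(sim[0])
    best = max(permutations(range(cols), rows),
               key=lambda p: sum(sim[i][p[i]] for i in range(rows)))
    return [(i, j) for i, j in enumerate(best)]


# Hypothetical similarity matrices from two first-line matchers
# (rows: source attributes, columns: target attributes).
term_like = [[0.9, 0.8],
             [0.8, 0.1]]
other = [[0.7, 0.6],
         [0.6, 0.3]]

combined = combine([term_like, other])  # still a similarity matrix, no decision
matching = mwbg(combined)               # a single 1:1 schema matching
print(combined)
print(matching)
```

Picking the best target per row of `combined` would map both source attributes to target 0, which violates the 1:1 constraint. The decisive matcher instead gives up the single highest entry in exchange for a higher total similarity.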
As a concluding remark, we compare the proposed classification with the classifications of
Rahm and Bernstein [2001] and Euzenat and Shvaiko [2007]. Rahm and Bernstein [2001] partition
matchers into individual matchers and combining matchers. The latter class contains only second-
line schema matchers. Individual matchers can also serve as second-line matchers. For example, a
matcher that takes the outcome of another matcher and applies a threshold condition on it is an
individual, second-line matcher. Combining matchers are further partitioned into composite and
hybrid matchers, a classification that is less relevant in our classification system, where the sec-
 