(R.CardInfo.cardNum; S.HotelCardInfo.clientNum; R.CardInfo.type = 'RoomsRUs')
(R.CardInfo.cardNum; S.CardInfo.cardNum; R.CardInfo.type ≠ 'RoomsRUs')
Contextual attribute correspondences are useful in overcoming various aspects
of structural heterogeneity. A typical example of such heterogeneity involves
a designer's decision regarding the interpretation of subtypes. In the example above,
database R was designed to include all credit card subtypes in a single relation,
with type as a differentiating value. Database S refines this decision by allocating a
separate relation to one of the subtypes.
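The two correspondences of the example can be sketched in code as a pair of attribute names plus a condition on R's instance values. This is only an illustrative sketch: the class name ContextualCorrespondence, the helper route, and the sample rows are assumptions made for the illustration and do not come from the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, str]


@dataclass(frozen=True)
class ContextualCorrespondence:
    """Hypothetical container: an attribute pair plus a context on R's rows."""
    source_attr: str
    target_attr: str
    condition: Callable[[Row], bool]


# The two contextual correspondences from the example above.
correspondences = [
    ContextualCorrespondence(
        "R.CardInfo.cardNum", "S.HotelCardInfo.clientNum",
        lambda row: row["type"] == "RoomsRUs"),
    ContextualCorrespondence(
        "R.CardInfo.cardNum", "S.CardInfo.cardNum",
        lambda row: row["type"] != "RoomsRUs"),
]


def route(row: Row) -> List[str]:
    """Return the target attributes whose context the row satisfies."""
    return [c.target_attr for c in correspondences if c.condition(row)]


# Made-up sample rows of R.CardInfo, for illustration only.
card_info = [
    {"cardNum": "1111", "type": "RoomsRUs"},
    {"cardNum": "2222", "type": "Visa"},
]
targets = [route(row) for row in card_info]
```

Because the two conditions are complementary, each row of R.CardInfo is routed to exactly one relation of S, which mirrors the subtype split described above.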
In Bohannon et al. [2006], a selection condition is defined as a logical condition,
with the added benefit of serving as a basis for the schema mapping process
[Barbosa et al. 2005; Bohannon et al. 2005; Fagin 2006; Fagin et al. 2007].
At the basis of contextual attribute correspondences is the use of instance values
as a differentiator between possible correspondences. Therefore, the ability to
identify contextual attribute correspondences depends on the ability of a matcher to
take instance values into account. For example, the Term matching technique, given
earlier as an example, will not change its estimate of the similarity of
two attributes based on context. Instance values are used in many of the methods
that apply machine learning techniques to schema matching. Autoplex [Berlin and
Motro 2001], LSD [Doan et al. 2001], and iMAP [Dhamankar et al. 2004] use a
naïve Bayes classifier to learn attribute correspondence probabilities from an
instance training set. Also, sPLMap [Nottelmann and Straccia 2007] uses naïve Bayes,
kNN, and KL-distance as content-based classifiers.
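To make the idea of an instance-based matcher concrete, the following is a minimal naïve Bayes sketch that scores how likely the values of a source attribute are to belong to each target attribute. It is not the implementation of Autoplex, LSD, iMAP, or sPLMap; the character-bigram features, the add-one smoothing, the averaging of posteriors, and the attribute names S.CardInfo.expiry and S.CardInfo.holder are all assumptions made for the illustration.

```python
import math
from collections import Counter, defaultdict


def tokens(value):
    """Character bigrams as features (an arbitrary choice for this sketch)."""
    s = value.lower()
    return [s[i:i + 2] for i in range(len(s) - 1)] or [s]


class NaiveBayesMatcher:
    def __init__(self, training):
        # training maps each target attribute to a list of its instance values.
        self.counts = {a: Counter(t for v in vals for t in tokens(v))
                       for a, vals in training.items()}
        self.totals = {a: sum(c.values()) for a, c in self.counts.items()}
        self.vocab = set().union(*self.counts.values())
        n = sum(len(vals) for vals in training.values())
        self.prior = {a: len(vals) / n for a, vals in training.items()}

    def posterior(self, value):
        """P(attribute | value), computed in log space for stability."""
        logp = {}
        for a in self.counts:
            lp = math.log(self.prior[a])
            for t in tokens(value):
                # Add-one smoothing; the extra +1 reserves mass for unseen tokens.
                lp += math.log((self.counts[a][t] + 1) /
                               (self.totals[a] + len(self.vocab) + 1))
            logp[a] = lp
        m = max(logp.values())
        z = sum(math.exp(lp - m) for lp in logp.values())
        return {a: math.exp(lp - m) / z for a, lp in logp.items()}

    def match(self, values):
        """Average the per-value posteriors over a source attribute's values."""
        acc = defaultdict(float)
        for v in values:
            for a, p in self.posterior(v).items():
                acc[a] += p
        return {a: s / len(values) for a, s in acc.items()}


# Made-up training values for two hypothetical attributes of S.
matcher = NaiveBayesMatcher({
    "S.CardInfo.expiry": ["01/25", "11/26", "07/24"],
    "S.CardInfo.holder": ["Ann Lee", "Bob Ray", "Cy Young"],
})
scores = matcher.match(["03/25", "12/24"])
```

Running match on a subset of the source rows, for instance only those with a given value of type, is one way such a matcher could produce scores that differ by context.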
3.1 Modeling Contextual Attribute Correspondences
Contextual attribute correspondences are specified in terms of a condition on the
value assignments of attributes. A k-context of an attribute correspondence is a
condition that involves k database attributes. For k = 0, a contextual attribute
correspondence becomes a common attribute correspondence. For k = 1, the condition is
simple, of the form a = v, where a is an attribute and v is a constant in a's domain.
For example, R.CardInfo.type = 'RoomsRUs'. Disjunctive, conjunctive, and general
k-contexts generalize simple conditions in the usual way. For example, a simple
disjunctive k-context for k = 1 is a condition of the form a ∈ {v_1, v_2, ..., v_k}.
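The kinds of k-contexts listed above can be sketched as predicates over a row. This is a minimal illustration, not a definitive model: the function names are invented here, the 0-context is assumed to be the always-true condition, and the issuer attribute is hypothetical.

```python
from typing import Callable, Dict, Iterable

Row = Dict[str, str]
Context = Callable[[Row], bool]


def empty() -> Context:
    """0-context: no condition, so the correspondence always applies (assumed)."""
    return lambda row: True


def simple(attr: str, value: str) -> Context:
    """1-context of the form a = v."""
    return lambda row: row[attr] == value


def simple_disjunctive(attr: str, values: Iterable[str]) -> Context:
    """1-context of the form a in {v_1, v_2, ...}: one attribute, several values."""
    allowed = frozenset(values)
    return lambda row: row[attr] in allowed


def conjunctive(*contexts: Context) -> Context:
    """Conjunction of simpler contexts, possibly over several attributes."""
    return lambda row: all(c(row) for c in contexts)


# Made-up row; 'issuer' is a hypothetical second attribute.
row = {"type": "RoomsRUs", "issuer": "BankA"}

is_hotel_card = simple("type", "RoomsRUs")
is_known_type = simple_disjunctive("type", ["RoomsRUs", "Visa"])
is_hotel_card_of_bank_a = conjunctive(is_hotel_card, simple("issuer", "BankA"))
```

Here is_hotel_card_of_bank_a involves two attributes and so plays the role of a 2-context, while the other two conditions each involve only type.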
Contextual attribute correspondences can be modeled with similarity matrices.
An entry in the similarity matrix M_{i,j} is extended to be a tuple ⟨v, c⟩, where
v ∈ [0, 1] is a similarity value and c is a context as defined above. This modeling
allows a smooth extension of contextual attribute correspondences to matcher
ensembles [Domshlak et al. 2007; He and Chang 2005], in which matchers are
combined to improve the quality of the outcome of the matching process. For example,
Do et al. [2002] and Domshlak et al. [2007] proposed several ways to combine
similarity matrices, generated by different matchers, into a single matrix. Such
combination, which was based solely on aggregating similarity scores, can be extended
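A similarity matrix whose entries pair a similarity value v in [0, 1] with a context c might be represented as in the sketch below. The source does not say here how contexts are treated when matrices are combined, so the rule used, averaging scores over all matchers for entries that share the same attribute pair and the same context, is purely an assumption, as are the names Entry and combine and the sample scores.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Entry(NamedTuple):
    v: float  # similarity value in [0, 1]
    c: str    # context as a condition string; "" stands for the empty context


# One matcher's matrix: (source attribute, target attribute) -> (v, c).
Matrix = Dict[Tuple[str, str], Entry]


def combine(matrices: List[Matrix]) -> Dict[Tuple[str, str, str], float]:
    """Average scores per (source, target, context) over all matchers.

    Assumed rule: a matcher that does not report a given (pair, context)
    contributes a score of 0 to that entry.
    """
    sums = defaultdict(float)
    for m in matrices:
        for (src, tgt), entry in m.items():
            sums[(src, tgt, entry.c)] += entry.v
    return {key: total / len(matrices) for key, total in sums.items()}


hotel = "R.CardInfo.type = 'RoomsRUs'"
other = "R.CardInfo.type != 'RoomsRUs'"

# Made-up scores from two hypothetical matchers.
m1: Matrix = {
    ("R.CardInfo.cardNum", "S.HotelCardInfo.clientNum"): Entry(0.8, hotel),
}
m2: Matrix = {
    ("R.CardInfo.cardNum", "S.HotelCardInfo.clientNum"): Entry(0.6, hotel),
    ("R.CardInfo.cardNum", "S.CardInfo.cardNum"): Entry(0.9, other),
}
combined = combine([m1, m2])
```

Other aggregation rules, such as taking the maximum or weighting the matchers, fit the same structure by changing only the body of combine.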