Enhancing the Capabilities of Attribute Correspondences - Schema Matching and Mapping


on the x-axis represents a class of schema matchings with a different precision. The z-axis represents the similarity measure. Finally, the y-axis stands for the number of schema matchings from a given precision class and with a given similarity measure.

Two main insights are available from Fig. 3.2. First, the similarity measures of matchings within each schema matching class form a “bell” shape, centered around a specific similarity measure. Such behavior indicates a certain level of robustness of a schema matcher, which assigns close similarity measures to matchings within each class. Second, the “tails” of the bell shapes overlap. Therefore, a schema matching from a class of lower precision may receive a higher similarity measure than a matching from a class of higher precision. This, of course, contradicts the monotonicity definition. However, the first observation serves as motivation for a definition of statistical monotonicity, first introduced in Gal et al. [2005a], as follows:

Let $\Sigma = \{\sigma_1, \sigma_2, \ldots, \sigma_m\}$ be a set of matchings over schemata $S_1$ and $S_2$ with $n_1$ and $n_2$ attributes, respectively, and define $n = \max(n_1, n_2)$. Let $\Sigma_1, \Sigma_2, \ldots, \Sigma_{n+1}$ be subsets of $\Sigma$ such that for all $1 \le i \le n+1$, $\sigma \in \Sigma_i$ iff $\frac{i-1}{n} \le p(\sigma) < \frac{i}{n}$. We define $M_i$ to be a random variable representing the similarity measure of a randomly chosen matching from $\Sigma_i$. $\Sigma$ is statistically monotonic if the following inequality holds for any $1 \le i < j \le n+1$:

$$\Omega(M_i) < \Omega(M_j),$$

where $\Omega(M)$ stands for the expected value of $M$.
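The definition above can be checked mechanically on a sample of matchings. The following is a minimal Python sketch, not code from the source: it assumes each matching is given as a (precision, similarity) pair, that a matching is drawn uniformly from its class (so the expected value is the sample mean), and that empty precision classes are simply skipped. The function names `precision_class` and `is_statistically_monotonic` are hypothetical.

```python
from collections import defaultdict
from math import floor
from statistics import mean


def precision_class(p, n):
    """Return the class index i in 1..n+1 such that (i-1)/n <= p < i/n."""
    if not 0 <= p <= 1:
        raise ValueError("precision must lie in [0, 1]")
    # Caveat: with float precisions, p * n can land just below an integer;
    # use fractions.Fraction for exact class boundaries if that matters.
    return floor(p * n) + 1


def is_statistically_monotonic(matchings, n):
    """matchings: iterable of (precision, similarity); n = max(n1, n2)."""
    classes = defaultdict(list)
    for p, similarity in matchings:
        classes[precision_class(p, n)].append(similarity)
    # Assumption: an empty class has no expected value, so it is skipped.
    # The mean stands in for the expected value under uniform sampling.
    means = [mean(classes[i]) for i in sorted(classes)]
    # Strictly increasing consecutive means imply the inequality for all i < j.
    return all(a < b for a, b in zip(means, means[1:]))


# Overlapping tails: 0.4 (precision 0.0) exceeds 0.35 (precision 0.5),
# yet the class means still increase with precision.
sample = [(0.0, 0.2), (0.0, 0.4), (0.5, 0.35), (0.5, 0.55), (1.0, 0.9)]
print(is_statistically_monotonic(sample, n=2))
```

In this sample, an individual matching from a lower precision class outscores one from a higher class, so plain monotonicity fails, while the per-class averages remain ordered.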

Intuitively, a schema matching algorithm is statistically monotonic with respect to two given schemata if the expected certainty level increases with precision. Statistical monotonicity can help explain certain phenomena in schema matching (e.g., why schema matcher ensembles work well [Gal and Sagi 2010]) and can also serve as a guideline for finding better ways to use schema matching algorithms.

3 Contextual Attribute Correspondences

Attribute correspondences may hold under certain instance conditions. With contextual attribute correspondences, selection conditions are associated with attribute correspondences. Therefore, a contextual attribute correspondence is a triplet of the form $(A_i, A_j, c)$, where $A_i$ and $A_j$ are attributes and $c$ is a condition whose structure is defined in Sect. 3.1.

Example 2. With contextual attribute correspondences, we could state that R.CardInfo.cardNum is the same as S.HotelCardInfo.clientNum if R.CardInfo.type is assigned the value RoomsRUs. For all other type values, R.CardInfo.cardNum is the same as S.CardInfo.cardNum. These contextual attribute correspondences are given as follows.
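As an illustration only, and not in the book's own notation, the two correspondences of Example 2 might be modeled in Python as triplets of source attribute, target attribute, and condition. This sketch assumes a condition can be represented as a predicate over a source tuple; the names `ContextualCorrespondence` and `applicable` are hypothetical, and the negated condition on the second correspondence is one reading of "all other type values".

```python
from dataclasses import dataclass
from typing import Callable, List, Mapping


@dataclass(frozen=True)
class ContextualCorrespondence:
    """A triplet (source attribute, target attribute, condition)."""
    source: str
    target: str
    condition: Callable[[Mapping[str, object]], bool]


# The two correspondences described in Example 2 (illustrative encoding).
correspondences = [
    ContextualCorrespondence(
        "R.CardInfo.cardNum",
        "S.HotelCardInfo.clientNum",
        lambda row: row["type"] == "RoomsRUs",
    ),
    ContextualCorrespondence(
        "R.CardInfo.cardNum",
        "S.CardInfo.cardNum",
        lambda row: row["type"] != "RoomsRUs",
    ),
]


def applicable(row: Mapping[str, object]) -> List[str]:
    """Return the target attributes whose condition holds for this R.CardInfo row."""
    return [c.target for c in correspondences if c.condition(row)]


print(applicable({"type": "RoomsRUs", "cardNum": "1234"}))
print(applicable({"type": "OtherCard", "cardNum": "5678"}))
```

Because the two conditions are complementary, each R.CardInfo row selects exactly one target attribute for cardNum.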
