Modeling Uncertain Schema Matching - Uncertain Schema Matching - page 11

3.1.2 ATTRIBUTE CORRESPONDENCES AND THE SIMILARITY MATRIX

Let $S$ and $S'$ be schemata with $n$ and $n'$ attributes, respectively.¹ Let $\mathcal{S} = S \times S'$ be the set of all possible attribute correspondences between $S$ and $S'$. $\mathcal{S}$ is a set of attribute pairs (e.g., (arrivalDate, checkInDay)). Let $M_{S,S'}$ be an $n \times n'$ similarity matrix over $\mathcal{S}$, where $M_{i,j}$ represents a degree of similarity between the $i$-th attribute of $S$ and the $j$-th attribute of $S'$. The majority of works in the schema matching literature define $M_{i,j}$ to be a real number in $(0, 1)$. $M_{S,S'}$ is a binary similarity matrix if $M_{i,j} \in \{0, 1\}$ for all $1 \le i \le n$ and $1 \le j \le n'$. That is, a binary similarity matrix accepts only 0 and 1 as possible values.
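The definitions above can be illustrated with a minimal Python sketch. The attribute lists are borrowed from the running example, and the helper names `correspondences` and `is_binary` are hypothetical, introduced only for illustration.

```python
from itertools import product

# Attribute names taken from the running example; the list layout is an assumption.
S = ["cardNum", "city", "arrivalDate", "departureDate"]
S_prime = ["clientNum", "city", "checkInDay", "checkOutDay"]


def correspondences(s, s_prime):
    """Return every attribute pair in the cross product of the two schemata."""
    return list(product(s, s_prime))


def is_binary(matrix):
    """Return True if every entry of the matrix is 0 or 1."""
    return all(value in (0, 1) for row in matrix for value in row)


pairs = correspondences(S, S_prime)
```

With four attributes on each side, `pairs` holds all 16 candidate correspondences, including `("arrivalDate", "checkInDay")`.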

Table 3.2: A Similarity Matrix Example

S2 ↓ / S1 →       1 cardNum   2 city   3 arrivalDate   4 departureDate
1 clientNum       0.843       0.323    0.317           0.302
2 city            0.290       1.000    0.326           0.303
3 checkInDay      0.344       0.328    0.351           0.352
4 checkOutDay     0.312       0.310    0.359           0.356

Table 3.3: A Binary Similarity Matrix Example

S2 ↓ / S1 →       1 cardNum   2 city   3 arrivalDate   4 departureDate
1 clientNum       1           0        0               0
2 city            0           1        0               0
3 checkInDay      0           0        0               1
4 checkOutDay     0           0        1               0

Example 3.2 Consider Tables 3.2 and 3.3, representing simplified similarity matrices of the running case study. The similarity matrix in Table 3.2 is a simplified version of the matching between the two schemata of Example 3.1. The similarity matrix in Table 3.3 is a binary similarity matrix. Matrix elements are given using both attribute names and numbers.
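As a rough illustration of how the two tables relate, the sketch below stores the values of Table 3.2 as a nested list (rows follow S2, columns follow S1) and applies a row-maximum rule. The text does not state how Table 3.3 was derived, so the rule and the function name `binarize_by_row_max` are assumptions; for these particular values the rule produces the same entries as Table 3.3.

```python
# Values transcribed from Table 3.2 (rows: S2 attributes, columns: S1 attributes).
S1 = ["cardNum", "city", "arrivalDate", "departureDate"]
S2 = ["clientNum", "city", "checkInDay", "checkOutDay"]
M = [
    [0.843, 0.323, 0.317, 0.302],  # clientNum
    [0.290, 1.000, 0.326, 0.303],  # city
    [0.344, 0.328, 0.351, 0.352],  # checkInDay
    [0.312, 0.310, 0.359, 0.356],  # checkOutDay
]


def binarize_by_row_max(matrix):
    """Assumed rule: set the largest entry of each row to 1 and all others to 0."""
    result = []
    for row in matrix:
        best = row.index(max(row))  # first position of the row maximum
        result.append([1 if j == best else 0 for j in range(len(row))])
    return result


B = binarize_by_row_max(M)
```

Other selection rules, such as a fixed threshold, would generally give a different binary matrix, which is one reason the rule is labeled an assumption here.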

Similarity matrices are generated by schema matchers. Schema matchers are instantiations of the schema matching process. They differ mainly in the measures of similarity they employ, which yield different similarity matrices. These measures can be arbitrarily complex, and may use various techniques. Some matchers assume similar attributes are more likely to have similar names [He and Chang, 2003; Su et al., 2006]. Other matchers assume similar attributes share similar domains [Gal et al., 2005b; Madhavan et al., 2001]. Others yet take instance similarity as an indication

¹ For ease of exposition, we constrain our presentation to a matching process involving two schemata. Extensions to holistic schema matching are discussed in Section 3.2.
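To make the idea of a name-based matcher concrete, here is a minimal sketch that fills a similarity matrix from attribute names alone. It uses the standard-library `difflib.SequenceMatcher` ratio as a stand-in similarity measure; this is an assumption chosen for brevity, not the measure used by the cited works, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1] based on matching subsequences."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def name_matcher(s1, s2):
    """Build a similarity matrix with one row per s2 attribute and one column per s1 attribute."""
    return [[name_similarity(a, b) for a in s1] for b in s2]


S1 = ["cardNum", "city", "arrivalDate", "departureDate"]
S2 = ["clientNum", "city", "checkInDay", "checkOutDay"]
M_names = name_matcher(S1, S2)
```

A different measure plugged into `name_similarity` would yield a different matrix for the same pair of schemata, which is the sense in which matchers differ mainly in the similarity measures they employ.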
