Modeling Uncertain Schema Matching - Uncertain Schema Matching


The input to the process of schema matching is given by two schemata S and S′ and a constraint boolean function Γ. The output of the schema matching process is a schema matching σ. To illustrate, consider the examples in Tables 3.2 and 3.3. The similarity matrix in Table 3.2 represents a step in the schema matching process, in which the similarity of attribute correspondences is recorded. Table 3.3 presents a possible outcome of the matching process, where all attribute correspondences that are assigned a value of 1 are part of the resulting schema matching. The constraint function applied in this example enforces a 1 : 1 matching.

By now, it should be apparent that we consider the matrix abstraction a suitable model of uncertain schema matching. We therefore take our approach one step further and formalize the output of the matching process using similarity matrices.

Definition 3.4 (Matrix Satisfaction). Let M(S, S′) be an n × n′ similarity matrix over S × S′. A schema matching σ ∈ Σ is said to satisfy M(S, S′) (denoted σ |= M(S, S′)) if (A_i, A′_j) ∈ σ → M_{i,j} > 0. σ ∈ Σ_Γ is said to maximally satisfy M(S, S′) if σ |= M(S, S′) and, for each σ′ ∈ Σ_Γ such that σ′ |= M(S, S′), σ ⊄ σ′.

The output of a schema matcher is therefore a similarity matrix M(S, S′). An attribute pair (A_i, A′_j) can be considered an attribute correspondence in the output schema matching only if M_{i,j} > 0. A schema matching σ satisfies M if this holds for every attribute pair in σ. Finally, we define the output of the schema matching process to be a valid schema matching that maximally satisfies M(S, S′).
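The two conditions of Definition 3.4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the matrix is a nested list with 0-based indices, a schema matching is a set of (i, j) index pairs, and the set of valid schema matchings is passed in explicitly as a collection of frozensets. The names `satisfies`, `maximally_satisfies`, and `valid` are hypothetical and do not come from the text.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# Assumption: a correspondence (i, j) pairs attribute i of S with attribute j of S'.
Correspondence = Tuple[int, int]


def satisfies(sigma: Iterable[Correspondence], M: List[List[float]]) -> bool:
    """sigma |= M: every correspondence in sigma has a strictly positive entry in M."""
    return all(M[i][j] > 0 for (i, j) in sigma)


def maximally_satisfies(
    sigma: Iterable[Correspondence],
    M: List[List[float]],
    valid: Set[FrozenSet[Correspondence]],
) -> bool:
    """sigma is valid, satisfies M, and is not a proper subset of any other
    valid schema matching that also satisfies M."""
    sigma = frozenset(sigma)
    if sigma not in valid or not satisfies(sigma, M):
        return False
    # `sigma < other` is Python's proper-subset test on (frozen)sets.
    return not any(sigma < other for other in valid if satisfies(other, M))
```

Note that the empty schema matching satisfies any matrix vacuously under this reading, since the implication has no pair to check.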

Schema matching satisfaction, just like Γ, partitions Σ into two sets. In the case of the former, the partitioning is based on the matching requirements, while in the latter, it is based on the application and the matcher's ability to determine attribute requirements. Therefore, the partitioning induced by Γ does not necessarily overlap with that induced by the satisfaction criterion and may, at times, be at odds with it. It is easy to see that in the absence of a constraint function Γ, i.e., whenever Σ_Γ = Σ or whenever Γ is ignored by the matcher, there is exactly one schema matching that maximally satisfies M: the one that contains all attribute correspondences (A_i, A′_j) for which M_{i,j} > 0. However, when Γ is both meaningful and used by the matcher, there may be more than one valid schema matching that maximally satisfies M. For example, if a 1 : 1 constraint is enforced, there may be several schema matchings that satisfy the condition of maximal satisfaction, none of which contains or is contained by another. Similarly, if none of the valid schema matchings satisfies M then, clearly, the matcher yields no schema matching as an outcome of the matching process.
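The contrast between the unconstrained case and a 1 : 1 constraint can be checked by brute force on a small matrix. The sketch below is an assumption-laden illustration, not the book's algorithm: it enumerates every subset of the positive entries, so it is exponential and only suitable for toy inputs. The matrix values and the names `maximal_matchings`, `no_constraint`, and `one_to_one` are made up for the example.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Tuple

Correspondence = Tuple[int, int]
Matching = FrozenSet[Correspondence]


def no_constraint(sigma: Matching) -> bool:
    """Constraint function that accepts every schema matching."""
    return True


def one_to_one(sigma: Matching) -> bool:
    """1 : 1 constraint: no row index and no column index is used twice."""
    rows = [i for i, _ in sigma]
    cols = [j for _, j in sigma]
    return len(set(rows)) == len(rows) and len(set(cols)) == len(cols)


def maximal_matchings(
    M: List[List[float]], gamma: Callable[[Matching], bool]
) -> List[Matching]:
    """All valid schema matchings that maximally satisfy M (brute force)."""
    # A matching satisfies M exactly when it uses only positive entries.
    positive = [(i, j) for i, row in enumerate(M) for j, v in enumerate(row) if v > 0]
    candidates = [
        frozenset(c)
        for r in range(len(positive) + 1)
        for c in combinations(positive, r)
    ]
    valid = [s for s in candidates if gamma(s)]
    # Keep those not properly contained in another valid, satisfying matching.
    return [s for s in valid if not any(s < t for t in valid)]


# Hypothetical 2 x 2 similarity matrix, chosen only for illustration.
M = [[0.9, 0.4],
     [0.5, 0.0]]

unconstrained = maximal_matchings(M, no_constraint)
constrained = maximal_matchings(M, one_to_one)
```

For this matrix, `unconstrained` holds the single schema matching made of all three positive entries, while `constrained` holds two incomparable schema matchings, {(0, 0)} and {(0, 1), (1, 0)}. A constraint function that rejects every subset of the positive entries would make the function return an empty list, which corresponds to the matcher yielding no schema matching.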

3.1.4 (YET ANOTHER) SCHEMA MATCHER CLASSIFICATION

At first glance, the idea of defining matrix satisfaction seems odd. We have shown in Section 3.1.2 that in most cases schema matchers will not decisively assign a value of 0 to any attribute pair matching. Taking Definition 3.4 at face value is likely to result in any attribute pair being declared an attribute
