The input to the schema matching process consists of two schemata $S$ and $S'$ and a
Boolean constraint function $\Gamma$. The output of the process is a schema matching $\sigma$.
To illustrate, consider the examples in Tables 3.2 and 3.3. Table 3.2 represents a step in the
schema matching process in which the similarity of attribute correspondences is recorded in a
similarity matrix. Table 3.3 presents a possible outcome of the matching process, where all
attribute correspondences to which a value of 1 is assigned are part of the resulting
schema matching. The constraint function applied in this example enforces a 1 : 1 matching.
By now, it has become apparent that we conceive the matrix abstraction to be a suitable model
of uncertain schema matching. Therefore, we take our approach one step further and provide a
formalization of the matching process output using similarity matrices.
Definition 3.4 (Matrix Satisfaction). Let $M_{S,S'}$ be an $n \times n'$ similarity matrix over
$\mathcal{S}$. A schema matching $\sigma \in \Sigma$ is said to satisfy $M_{S,S'}$ (denoted
$\sigma \models M_{S,S'}$) if $(A_i, A'_j) \in \sigma \rightarrow M_{i,j} > 0$.
$\sigma \in \Sigma$ is said to maximally satisfy $M_{S,S'}$ if $\sigma \models M_{S,S'}$ and, for each
$\sigma' \in \Sigma$ such that $\sigma' \models M_{S,S'}$, $\sigma' \subseteq \sigma$.
The output of a schema matching is therefore a similarity matrix $M_{S,S'}$. An attribute
pair $(A_i, A'_j)$ can be considered an attribute correspondence in the output schema matching only
if $M_{i,j} > 0$. A schema matching $\sigma$ satisfies $M$ if this holds for every attribute pair in $\sigma$.
Finally, we define the output of the schema matching process to be a valid schema matching that
maximally satisfies $M_{S,S'}$.
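As a rough illustration of Definition 3.4, the following Python sketch checks satisfaction and builds the maximally satisfying matching for the case in which no constraint is applied. The matrix values, the representation of a matching as a set of index pairs, and the function names are assumptions made for this example; they do not come from the text.

```python
from typing import List, Set, Tuple

# Assumption: a schema matching is modeled as a set of (i, j) index pairs,
# where i indexes attributes of S and j indexes attributes of S'.
Matching = Set[Tuple[int, int]]


def satisfies(sigma: Matching, M: List[List[float]]) -> bool:
    """Return True if every pair (i, j) in sigma has M[i][j] > 0."""
    return all(M[i][j] > 0 for i, j in sigma)


def maximal_unconstrained(M: List[List[float]]) -> Matching:
    """Collect every pair with positive similarity.

    Without a constraint function, this set contains every matching
    that satisfies M, so it is the maximally satisfying matching.
    """
    return {(i, j) for i, row in enumerate(M) for j, v in enumerate(row) if v > 0}


# Hypothetical 2 x 3 similarity matrix (illustrative values only).
M = [
    [0.8, 0.0, 0.3],
    [0.0, 0.6, 0.0],
]

print(satisfies({(0, 0), (1, 1)}, M))    # True
print(satisfies({(0, 1)}, M))            # False, since M[0][1] == 0
print(sorted(maximal_unconstrained(M)))  # [(0, 0), (0, 2), (1, 1)]
```

Under this reading, any subset of the positive-similarity pairs satisfies the matrix, and the full set of such pairs contains all of them.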
Schema matching satisfaction, just like $\Gamma$, partitions $\Sigma$ into two sets. In the case of the
former, the partitioning is based on the matching requirements, while in the latter, it is based on the
application and the matcher's ability to determine attribute requirements. Therefore, the partitioning
induced by $\Gamma$ does not necessarily overlap with that induced by the satisfaction criterion and may, at
times, be at odds with it. It is easy to see that in the absence of a constraint function $\Gamma$, i.e., whenever
$\Sigma_\Gamma = \Sigma$ or whenever $\Gamma$ is ignored by the matcher, there is exactly one schema matching that
maximally satisfies $M$. This schema matching is the one that contains all attribute correspondences
$(A_i, A'_j)$ for which $M_{i,j} > 0$. However, when $\Gamma$ is both meaningful and used by the matcher, there
may be more than a single valid schema matching that maximally satisfies $M$. For example, if a 1 : 1
constraint is enforced, there may be several schema matchings that satisfy the condition of maximal
satisfaction, none of which contains or is contained in another. Similarly, if none of the valid schema
matchings satisfies $M$ then, clearly, the matcher yields no schema matching as an outcome of the
matching process.
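The effect of a 1 : 1 constraint can be sketched with a small brute-force search. The code below is only an illustration under stated assumptions: the matrix values are hypothetical, `is_one_to_one` stands in for the constraint function, and "maximal" is interpreted as "not strictly contained in another valid matching that satisfies the matrix".

```python
from itertools import combinations
from typing import Collection, FrozenSet, List, Tuple

Pair = Tuple[int, int]


def is_one_to_one(sigma: Collection[Pair]) -> bool:
    """Hypothetical constraint: each row and column index appears at most once."""
    rows = [i for i, _ in sigma]
    cols = [j for _, j in sigma]
    return len(set(rows)) == len(rows) and len(set(cols)) == len(cols)


def maximal_valid_matchings(M: List[List[float]]) -> List[FrozenSet[Pair]]:
    """Enumerate 1 : 1 matchings over positive entries and keep the maximal ones.

    The search is exponential in the number of positive entries,
    so it is only suitable for tiny matrices.
    """
    positive = [(i, j) for i, row in enumerate(M) for j, v in enumerate(row) if v > 0]
    valid = [
        frozenset(c)
        for k in range(len(positive) + 1)
        for c in combinations(positive, k)
        if is_one_to_one(c)
    ]
    # Keep a matching only if no other valid matching strictly contains it.
    return [s for s in valid if not any(s < t for t in valid)]


# Hypothetical 2 x 2 similarity matrix (illustrative values only).
M = [
    [0.9, 0.4],
    [0.7, 0.0],
]

for sigma in maximal_valid_matchings(M):
    print(sorted(sigma))
# [(0, 0)]
# [(0, 1), (1, 0)]
```

For this matrix the search returns two matchings, neither of which contains the other, which is the situation the paragraph above describes.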
3.1.4 (YET ANOTHER) SCHEMA MATCHER CLASSIFICATION
At first glance, the idea of defining matrix satisfaction seems odd. We have shown in Section 3.1.2 that
in most cases schema matchers will not decisively assign a value of 0 to any attribute pair matching.
Taking Definition 3.4 at face value is likely to result in any attribute pair being declared an attribute