Mapping-Based Merging of Schemas - Schema Matching and Mapping


Merging ontologies: An ontology describes the concepts in a domain and the relationships between those concepts [Fikes 1996]. Ontologies are commonplace in domains as varied as anatomy and civil engineering. Often a domain has more than one “standard” ontology for the same general concepts. For example, the Foundational Model of Anatomy (FMA) [Rosse et al. 1998] is designed to model anatomy in great detail, whereas the Galen Common Reference Model [Rector et al. 1994] is designed to model anatomy for clinical applications. Because these two ontologies serve different communities, they have different concepts even though the domain is roughly the same. Merging the two ontologies would allow users to understand how all the concepts are related.

All of these applications have the same problem: given two or more structured representations of data, which we often refer to as models [Bernstein et al. 2000], combine the models to form one unified representation. These applications may also seek to create mappings between the unified version and the smaller input schemas or ontologies. Many works have looked at these problems, both alone and in concert. This paper surveys some of the work in this area. In particular, Sect. 2 begins by describing a number of theoretical works that are relevant to multiple merging situations. Section 3 looks at work on view integration. Section 4 looks at work on data integration. Section 5 looks at work on merging ontologies. Section 6 looks at generic approaches for merging structured data representations. Section 7 surveys work on a variation of the problem: the data to be merged has been modified from a common ancestor, and the changes must now be incorporated together. This variation is common both in file systems and in computer-supported collaborative work. Section 8 discusses commonalities and differences. Finally, Sect. 9 concludes.

Throughout this paper, we assume that the relationships between the schemas have already been created; creating them is beyond the scope of the paper. Readers interested in creating mappings are referred to existing surveys [Rahm and Bernstein 2001; Doan and Halevy 2004].

2 Theoretical Underpinnings

2.1 Information Capacity

The key notion of information capacity [Hull 1984] is that when comparing two schemas E and G, one can consider how much of the data in E can be accessed using G and vice versa.
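As a rough illustration of what it means for the data in E to be accessible through G, the sketch below treats each schema as a small finite set of instances and checks whether a mapping from E's instances to G's instances is defined everywhere and never sends two distinct instances to the same image. The function name `preserves_information`, the toy instances, and the two example mappings are all hypothetical and chosen only for illustration; real schemas generally have infinitely many instances, so this is a sketch of the idea under that simplifying assumption and not a general test.

```python
from typing import Dict, Hashable, Iterable


def preserves_information(instances_e: Iterable[Hashable],
                          mapping: Dict[Hashable, Hashable]) -> bool:
    """Return True if `mapping` is defined on every instance of E and
    sends distinct instances of E to distinct instances of G."""
    instances = set(instances_e)
    # Total: every instance of E must have an image under the mapping.
    if any(inst not in mapping for inst in instances):
        return False
    # Injective: no two instances of E may share an image; otherwise the
    # original E instance could not be recovered from its image in G.
    images = [mapping[inst] for inst in instances]
    return len(set(images)) == len(images)


# Hypothetical toy instances: E stores (first, last) pairs.
E_INSTANCES = [("Ada", "Lovelace"), ("Ada", "Byron")]

# G stores a full name: every E instance keeps a distinct image.
to_full_name = {inst: f"{inst[0]} {inst[1]}" for inst in E_INSTANCES}

# G stores only the first name: both E instances collapse to "Ada".
to_first_name = {inst: inst[0] for inst in E_INSTANCES}
```

Here `preserves_information(E_INSTANCES, to_full_name)` holds, while `to_first_name` fails the check because the two instances can no longer be told apart once only the first name is kept.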

Miller et al. [1993] study which properties of information capacity are required for both data integration and view integration. The key to understanding the requirements lies in the definitions of equivalence and dominance:

To be information capacity preserving, a mapping $I(S_1) \to I(S_2)$ must be defined on every element in $S_1$ and functional in both directions. If so, then $S_2$ dominates $S_1$, denoted $S_1 \preceq S_2$. If $S_1 \preceq S_2$ and $S_2 \preceq S_1$, then $S_1$ and $S_2$ are equivalent,
