Merging ontologies: An ontology describes the concepts in a domain and the relationships
between those concepts [ Fikes 1996 ]. Ontologies are commonplace in
varied domains such as anatomy and civil engineering. Often a domain has more
than one “standard” ontology for the same general concepts. For example, the
Foundational Model of Anatomy (FMA) [ Rosse et al. 1998 ] is designed to model
anatomy in great detail, whereas the Galen Common Reference Model [ Rector
et al. 1994 ] is designed to model anatomy for clinical applications. Because
these two ontologies serve different communities, they have different concepts
even though the domain is roughly the same. Merging the two ontologies would
allow users to understand how all the concepts are related.
All of these applications have the same problem: given two or more structured
representations of data, which we often refer to as models [ Bernstein et al.
2000 ], combine the models to form one unified representation. These applications
may also seek to create the mappings between the unified version and the
smaller input schemas/ontologies. Many works have looked at these
problems, both alone and in concert. This paper surveys some of the works in this
area. In particular, Sect. 2 begins by describing a number of theoretical works that
are relevant for multiple merging situations. Section 3 looks at works on view inte-
gration. Section 4 looks at work on data integration. Section 5 looks at work on
merging ontologies. Section 6 looks at generic approaches for merging structured
data representations. Section 7 surveys work on a variation of the problem: the data
to be merged has been modified from a common ancestor and now the changes
must be incorporated together. This variation is common both in file systems and
in computer-supported collaborative work. Section 8 discusses commonalities and
differences. Finally, Sect. 9 concludes.
Throughout this paper, we assume that the relationships between the schemas
have already been created; creating them is beyond the scope of the paper. Readers
interested in creating mappings are referred to existing surveys [ Rahm and Bernstein 2001 ;
Doan and Halevy 2004 ].
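As a rough illustration of the problem statement, the following Python sketch merges two toy models whose correspondences are assumed to be given. It is not any surveyed algorithm: the flat set-of-names model, the function name merge_models, and the element names are all hypothetical, and it assumes one-to-one correspondences and that unmatched elements have distinct names.

```python
def merge_models(model_a, model_b, correspondences):
    """Naive merge of two flat models (hypothetical representation).

    model_a, model_b: sets of element names.
    correspondences: set of (a_elem, b_elem) pairs, assumed to be given
        and one-to-one.
    Returns (merged, map_a, map_b), where map_a and map_b send each input
    element to its counterpart in the merged model.
    """
    # Every element of A keeps its own name in the merged model.
    map_a = {a: a for a in model_a}
    # An element of B is collapsed onto its A counterpart when one exists;
    # otherwise it keeps its own name (assumed not to clash with A's names).
    b_to_a = {b: a for a, b in correspondences}
    map_b = {b: b_to_a.get(b, b) for b in model_b}
    merged = set(map_a.values()) | set(map_b.values())
    return merged, map_a, map_b


# Toy input: two tiny models with one known correspondence.
model_a = {"Person", "Address"}
model_b = {"Employee", "Dept"}
merged, map_a, map_b = merge_models(
    model_a, model_b, {("Person", "Employee")}
)
```

Here merged contains three elements, because "Employee" is collapsed onto "Person", and map_a and map_b play the role of the mappings between the unified model and its inputs.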
2 Theoretical Underpinnings
2.1 Information Capacity
The key notion of information capacity [ Hull 1984 ] is that when comparing two
schemas E and G , one can consider how much of the data in E can be accessed
using G and vice versa.
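One way to make this notion concrete is to check, over a small finite set of instances, whether a mapping from instances of E to instances of G keeps distinct instances distinct. The Python sketch below does this; it is an informal illustration, not Hull's formal definition, and the relation layout (name, city) and all function names are assumptions made for the example.

```python
def loses_no_data(f, instances):
    """True if f sends distinct instances to distinct images (injective).

    f is an ordinary Python function, so it is defined on every instance
    passed in; only injectivity needs to be checked here.
    """
    distinct = set(instances)
    images = {f(i) for i in distinct}
    return len(images) == len(distinct)


def swap_columns(instance):
    # Hypothetical mapping E -> G that only reorders the two columns.
    return frozenset((city, name) for name, city in instance)


def project_name(instance):
    # Hypothetical mapping E -> G that drops the city column.
    return frozenset((name,) for name, _city in instance)


# Two toy instances of E: each is a set of (name, city) tuples.
instances_e = [
    frozenset({("ann", "paris")}),
    frozenset({("ann", "rome")}),
]

keeps_everything = loses_no_data(swap_columns, instances_e)
drops_something = loses_no_data(project_name, instances_e)
```

Reordering columns keeps the two instances distinguishable, so all of E's data remains accessible through G. Dropping the city column maps both instances to the same image, so some of E's data can no longer be recovered.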
Miller et al. [ 1993 ] study which properties of information capacity are required
for both data integration and view integration. Key to understanding the requirements
are the definitions of equivalence and dominance:
To be information capacity preserving, a mapping $I(S_1) \to I(S_2)$ must be defined
on every element in $S_1$ and functional in both directions. If so, then $S_2$ dominates
$S_1$, denoted $S_1 \preceq S_2$. If $S_1 \preceq S_2$ and $S_2 \preceq S_1$, then $S_1$ and $S_2$ are equivalent,