Towards Large-Scale Schema and Ontology Matching - Schema Matching and Mapping

For interactive schema matching, the user may interact with the system and the
match workflow in different ways (not shown in Fig. 1.1), preferably via a
user-friendly GUI. She typically has to specify the workflow configuration,
e.g., which matchers should be executed and which strategy and parameters should
be applied for the final combination and selection steps. The final results are
typically only suggested correspondences that the user can confirm or correct.
The match workflow itself could be executed on the whole input schemas or
incrementally for selected schema parts or even individual elements (Bernstein
et al. 2006). The latter approach is a simple but reasonable way to deal with
large schemas, as it reduces the performance requirements compared to matching
the whole schemas. Furthermore, the determined correspondences are easier to
visualize, so the user is not overwhelmed by huge mappings. Shi et al. (2009)
propose an interesting variation for interactive matching in which the system
asks the user for feedback on specific correspondences that are hard to
determine automatically and that are valuable as input for further matcher
executions.
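The configurable workflow described above (choose matchers, combine their scores, select suggestions) can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the two toy matchers, the averaging strategy, the threshold, and the names `name_matcher`, `token_matcher`, and `run_workflow` do not come from any of the cited systems. Passing only a subset of source elements corresponds to the incremental mode.

```python
from difflib import SequenceMatcher


def name_matcher(a: str, b: str) -> float:
    """Hypothetical matcher: string similarity of the element names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_matcher(a: str, b: str) -> float:
    """Hypothetical matcher: Jaccard overlap of underscore-separated tokens."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def average(scores):
    """One possible combination strategy: unweighted mean of matcher scores."""
    return sum(scores) / len(scores)


def run_workflow(source, target, matchers, combine, threshold):
    """Run all matchers on every element pair, combine the scores, and
    select pairs at or above the threshold as *suggested* correspondences."""
    suggestions = []
    for s in source:
        for t in target:
            score = combine([m(s, t) for m in matchers])
            if score >= threshold:
                suggestions.append((s, t, score))
    # Best suggestions first, so the user can confirm or correct them in order.
    return sorted(suggestions, key=lambda x: x[2], reverse=True)


# Incremental use: match a single selected element instead of the whole schema.
selected = ["customer_name"]
target_schema = ["customer_name", "order_date"]
result = run_workflow(selected, target_schema,
                      [name_matcher, token_matcher], average, threshold=0.5)
print(result)
```

In an interactive setting, the returned list would only be presented as suggestions; confirmed or corrected pairs could then be fed back as input to later matcher runs.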

2.2 Instance-Based and Usage-Based Matching

2.2.1 Instance-Based Ontology Matching

Instance-based ontology matching determines the similarity between ontology
concepts from the similarity of the instances associated with the concepts. For
example, two categories of a product catalog can be considered equivalent if
their products are largely the same or at least highly similar. One can argue
that instances characterize the semantics of schema elements or ontology
concepts very well, and potentially better than a concept name or comment.
Therefore, instance-based matching holds the promise of identifying high-quality
correspondences. On the other hand, obtaining sufficient and suitable instance
data for all ontologies and all ontology concepts to be matched is a major
problem, especially for large ontologies. Hence, we consider instance-based
approaches primarily as a complementary, albeit significant, match approach to
be used in addition to metadata-based matchers.

As indicated in Fig. 1.2, two main cases for instance-based ontology matching
can be distinguished, depending on whether or not the existence of common
instances is assumed. The existence of the same instances in different
ontologies (e.g., the same web pages categorized in different web directories,
the same products offered in different product catalogs, or the same set of
proteins described in different life science ontologies) simplifies the
determination of similar concepts. In this case, two concepts may be considered
equivalent when their instances overlap significantly. Different set similarity
measures can be used to measure such an instance overlap, e.g., based on Dice,
Jaccard, or cosine similarity. The instance overlap approach has been used to
match large life science ontologies (Kirsten et al. 2007) and product catalogs
(Thor et al. 2007).
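As a minimal sketch of the instance overlap idea, the following Python code computes the Dice, Jaccard, and cosine (Ochiai) set similarities for concepts that share instance identifiers, and proposes concept pairs whose overlap reaches a threshold. The catalog data, the 0.5 threshold, and the function name `match_concepts` are assumptions for illustration and are not taken from the cited systems.

```python
import math


def dice(a: set, b: set) -> float:
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(a: set, b: set) -> float:
    """Cosine similarity for sets: |A ∩ B| / sqrt(|A| * |B|)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def match_concepts(onto1, onto2, sim=dice, threshold=0.5):
    """Compare every concept pair by instance overlap and return the
    pairs whose similarity is at least `threshold`, best first."""
    pairs = []
    for c1, inst1 in onto1.items():
        for c2, inst2 in onto2.items():
            score = sim(inst1, inst2)
            if score >= threshold:
                pairs.append((c1, c2, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Hypothetical catalogs: concept name -> set of shared product identifiers.
catalog_a = {"Laptops": {"p1", "p2", "p3", "p4"}, "Phones": {"p5", "p6"}}
catalog_b = {"Notebooks": {"p1", "p2", "p3", "p7"}, "Mobile": {"p5", "p6", "p8"}}

print(match_concepts(catalog_a, catalog_b))
# [('Phones', 'Mobile', 0.8), ('Laptops', 'Notebooks', 0.75)]
```

The three measures rank pairs similarly but scale differently: for "Laptops" and "Notebooks" (three shared products out of four each), Dice and cosine both give 0.75 while Jaccard gives 0.6, so a threshold has to be chosen per measure.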
