Tuning for Schema Matching - Schema Matching and Mapping

A simple example of an aggregation function is given by BMatch [Duchateau et al. 2008b] or Cupid [Madhavan et al. 2001]. Their authors aggregate the results of a terminological measure with those computed by a structural measure, varying the weights applied to each measure (1/2 and 1/2, 1/3 and 2/3, etc.).

In most tools, default values are given to these weights. They are mainly the result of intensive experiments. For example, the default weights of COMA's name and data type similarity measures are 0.7 and 0.3, respectively [Do and Rahm 2002]. As explained in Glue [Doan et al. 2003] or APFEL [Ehrig et al. 2005], the weights of an aggregation function can also be tuned automatically with machine learning techniques.
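To make the role of the weights concrete, here is a minimal sketch of a weighted aggregation function. Only the weights 0.7 and 0.3 come from the text above; the two similarity measures (`name_similarity`, `type_similarity`), the `aggregate` helper, and the sample attribute names are assumptions made for illustration and are not the measures implemented by COMA or any other matcher.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Illustrative terminological measure: character-based ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def type_similarity(a: str, b: str) -> float:
    """Illustrative data type measure: 1.0 for identical types, else 0.0."""
    return 1.0 if a == b else 0.0


def aggregate(scores: dict, weights: dict) -> float:
    """Weighted average of the per-measure scores."""
    total = sum(weights.values())
    return sum(weights[m] * scores[m] for m in weights) / total


# Default weights quoted in the text for name and data type similarity.
WEIGHTS = {"name": 0.7, "type": 0.3}

# Hypothetical pair of schema elements, each given as (name, data type).
source, target = ("custName", "string"), ("customerName", "string")
scores = {
    "name": name_similarity(source[0], target[0]),
    "type": type_similarity(source[1], target[1]),
}
combined = aggregate(scores, WEIGHTS)
print(f"combined similarity: {combined:.2f}")
```

Dividing by the sum of the weights keeps the combined score in [0, 1] even when a user changes one weight without rebalancing the others.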

To illustrate how users can be helped to tune the weights of an aggregation function, we discuss the iMAP approach [Dhamankar et al. 2004]. This matcher mainly provides a new set of machine learning-based measures for discovering specific types of complex mappings (e.g., name is a concatenation of firstname and lastname). It also includes an explanation module that clarifies why a given correspondence was preferred over another candidate. For instance, this module can report that a string-matching classifier had a strong influence on a discovered correspondence. The user can then use this feedback to decrease the weight of this classifier.
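The feedback loop described here can be sketched as follows. This is not iMAP's explanation module, which is considerably more elaborate; it only assumes a weighted-average aggregation and reports how much each measure contributes to the final score. The measure names, scores, and weights are hypothetical.

```python
def contributions(scores: dict, weights: dict) -> list:
    """Return (measure, contribution) pairs, largest contribution first."""
    total = sum(weights.values())
    parts = {m: weights[m] * scores[m] / total for m in weights}
    return sorted(parts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores produced by two measures for one correspondence.
scores = {"string_matcher": 0.9, "structure": 0.2}
weights = {"string_matcher": 0.7, "structure": 0.3}

ranked = contributions(scores, weights)
dominant = ranked[0][0]
print(f"most influential measure: {dominant}")

# The user reacts to the explanation by lowering the dominant weight.
adjusted = {"string_matcher": 0.4, "structure": 0.6}
for measure, part in contributions(scores, adjusted):
    print(f"{measure}: {part:.2f}")
```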

5.3 Supporting Users to Revise Strategies

Although most matchers simply provide a graphical user interface to visualize the results, recent works have pointed out the need to support users in selecting the best strategy. For instance, a tool can include mechanisms to easily update the weights of a function so that users can directly analyse the impact of these changes.
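A minimal sketch of such an impact analysis is given below. It assumes a weighted-average aggregation and a fixed selection threshold, and compares the correspondences selected under two weight settings. The `select` function, the threshold of 0.5, the candidate pairs, and their scores are all invented for illustration and do not reproduce any particular tool.

```python
def select(candidates: dict, weights: dict, threshold: float = 0.5) -> set:
    """Return the candidate pairs whose combined score reaches the threshold."""
    total = sum(weights.values())
    selected = set()
    for pair, scores in candidates.items():
        combined = sum(weights[m] * scores[m] for m in weights) / total
        if combined >= threshold:
            selected.add(pair)
    return selected


# Hypothetical candidate correspondences with per-measure scores.
candidates = {
    ("name", "fullName"): {"terminological": 0.8, "structural": 0.3},
    ("addr", "location"): {"terminological": 0.1, "structural": 0.9},
    ("id", "price"): {"terminological": 0.1, "structural": 0.2},
}

before = select(candidates, {"terminological": 0.7, "structural": 0.3})
after = select(candidates, {"terminological": 0.3, "structural": 0.7})

# Show the user what the weight change gained and lost.
print("gained:", sorted(after - before))
print("lost:", sorted(before - after))
```

Reporting the difference between the two result sets, rather than only the new result, lets the user see at a glance which correspondences depend on the weight that was changed.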

Here, we describe recent works that aim at supporting users in selecting appropriate similarity measures and combining them. An effective combination requires well-tuned weights. Two tools have been designed to support users during these tasks: AgreementMaker and Harmony. Whatever technique they use (interactions with users or strategy filters), they enable a revision of the current strategy by adding, removing, or modifying the parameters and similarity measures involved in the combination. We describe each of these tools in the rest of this part.

5.3.1 AgreementMaker

The originality of AgreementMaker [Cruz et al. 2007, 2009] lies in its capability to combine matching methods. Moreover, it provides facilities for manually tuning the quality of matches. Indeed, one of the interesting features of AgreementMaker is a comprehensive user interface that supports both advanced visualization techniques and a control panel that drives the matching methods. This interface, depicted in Fig. 10.5, provides the user with facilities to evaluate the matching process, thus directly involving the user in the matching loop and in the evaluation of strategies.
