[Fig. 10.6 YAM: examples of combinations of similarity measures, (a) with a decision tree, (b) with nearest-neighbour generalized examples]
schema elements. In contrast, the NNge classifier builds groups of nearest-neighbour pairs of schema elements and then finds the best rule, expressed in Boolean logic, for each group. YAM currently includes 20 classifiers from the Weka library [Garner 1995]. According to [Duchateau 2009], experiments show that the tuned matchers produced by YAM improve F-measure by 20% over traditional approaches. The datasets mainly consist of average-sized schemas from various domains, but also include two datasets involving large schemas. As with most machine learning-based approaches, the authors observed that results may vary with the training data, hence the need to perform several runs during experiments.
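To make the idea concrete, each candidate pair of schema elements can be described by a vector of similarity scores, and a classifier then decides match or non-match. The sketch below is a minimal illustration of this scheme: the element names, the two similarity measures, and the thresholds in the hand-written decision rule are all invented for illustration and are not YAM's actual learned configuration. It also includes the F-measure used to report matching quality:

```python
import re
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Edit-based similarity on lowercased element names (one possible measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def tokens(name):
    """Split an element name on underscores and camelCase boundaries."""
    return set(re.findall(r"[a-z]+", re.sub(r"([A-Z])", r"_\1", name).lower()))

def token_overlap(a, b):
    """Jaccard overlap of name tokens (a second, independent measure)."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def tree_match(a, b):
    """Hand-written stand-in for a learned decision tree: thresholds on the
    two similarity scores decide match vs. non-match (illustrative values)."""
    if token_overlap(a, b) >= 0.5:
        return True
    return name_similarity(a, b) >= 0.55

def f_measure(predicted, gold):
    """Harmonic mean of precision and recall over sets of matched pairs."""
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(predicted), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

# Toy schemas (hypothetical element names).
source = ["customer_name", "cust_addr", "order_date"]
target = ["clientName", "address", "date_of_order"]
predicted = {(s, t) for s in source for t in target if tree_match(s, t)}
```

A learned tree would instead induce such thresholds (and the choice of measures) from labelled training pairs, which is why results depend on the training data, as noted above.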
5.5 Conclusion
In this section, we have described different strategies for combining similarity measures and for tuning them, mainly their weights. Fortunately, several tools exist to help users revise or select these strategies. Visual tools support users in configuring strategies manually, mainly through state-of-the-art GUIs. Finally, we have explored automatic approaches that are able to discover and tune the best strategy. In the next section, we move one level higher still: the first choice a user faces concerns the matching tool itself.
6 Matcher Selection
The selection of a schema matcher is obviously not a parameter: it does not fit the definitions provided in Sect. 2. Rather, it is a form of meta-tuning, since one first needs to choose a schema matcher before tuning its parameters and using it. Furthermore,