Database Reference
represented in XPath in terms of the ontology.
Data transformation consists of converting the data
so that they are expressed in terms of the ontology
and in the same format. Both tasks are performed
through XML queries associated with views of the
sources that are built automatically beforehand.
For data integration, we addressed the reference
reconciliation problem and presented a combination
of a logical and a numerical approach. Both
approaches exploit schema and data knowledge given
declaratively as a set of constraints, and are
therefore generic. The relations between references
are exploited either by L2R, which propagates
(non) reconciliation decisions through logical
rules, or by N2R, which propagates similarity
scores by solving an equation system. Both methods
are unsupervised because no labeled data set is
used. Furthermore, the combined approach is able
to capitalize on its experience by saving inferred
(non) synonymies. The results obtained by the
logical method are certain, which distinguishes
L2R from other existing works. The numerical
method complements the results of the logical one.
It exploits the schema and data knowledge and
expresses the similarity computation as a nonlinear
equation system. The experiments show promising
results for recall and, most importantly, a
significant increase in recall when constraints
are added.
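As an illustration only, the idea of propagating similarity scores by solving a nonlinear equation system can be sketched as a fixed-point iteration. The sketch below is not the N2R equation system itself: the function name propagate_similarities, the weight parameter, the use of max to combine scores, and the sample data are all assumptions chosen for brevity. Each pair of references starts from a local (attribute-based) score, and a pair may also inherit a damped score from the pairs it is related to.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # a pair of reference identifiers


def propagate_similarities(
    base: Dict[Pair, float],
    depends_on: Dict[Pair, List[Pair]],
    weight: float = 0.8,
    eps: float = 1e-6,
    max_iter: int = 100,
) -> Dict[Pair, float]:
    """Iterate x_p = max(base[p], weight * max(x_q for q related to p)).

    Hypothetical, simplified equations: the max operator is what makes
    the system nonlinear. Scores start at their local values and can
    only grow, so the iteration stabilises.
    """
    scores = dict(base)
    for _ in range(max_iter):
        updated = {}
        for pair, local in base.items():
            # Scores of related pairs, taken from the previous iteration.
            related = [scores[q] for q in depends_on.get(pair, [])]
            propagated = weight * max(related) if related else 0.0
            updated[pair] = max(local, propagated)
        # Largest change in this round; stop once nothing moves.
        delta = max((abs(updated[p] - scores[p]) for p in base), default=0.0)
        scores = updated
        if delta < eps:
            break
    return scores


# Hypothetical data: two museum references and the two city references
# they are located in. The museum names are only weakly similar, but
# the cities are strongly similar.
base = {("m1", "m2"): 0.4, ("c1", "c2"): 0.9}
depends_on = {
    ("m1", "m2"): [("c1", "c2")],
    ("c1", "c2"): [("m1", "m2")],
}
result = propagate_similarities(base, depends_on)
```

In this toy case the museum pair ends with a higher score than its local one because the strong similarity of the cities is propagated to it, while the city pair keeps its own score.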