Chapter 4
Uncertainty in Data Integration and Dataspace Support Platforms
Anish Das Sarma, Xin Luna Dong, and Alon Y. Halevy
Abstract Data integration has been an important area of research for several years.
However, such systems suffer from one of the main drawbacks of database systems:
the need to invest significant modeling effort upfront. Dataspace support platforms
(DSSP) envision a system that offers useful services on its data without any setup
effort and that improves with time in a pay-as-you-go fashion. We argue that to
support DSSPs, the system needs to model uncertainty at its core. We describe the
concepts of probabilistic mediated schemas and probabilistic mappings as enabling
concepts for DSSPs.
1 Introduction
Data integration and exchange systems offer a uniform interface to a multitude of
data sources and the ability to share data across multiple systems. These systems
have recently enjoyed significant research and commercial success (Halevy et al.
2005, 2006b). Current data integration systems are essentially a natural extension
of traditional database systems in that queries are specified in a structured form
and data are modeled in one of the traditional data models (relational, XML). In
addition, the data integration system has exact knowledge of how the data in the
sources map to the schema used by the data integration system.
A.D. Sarma (✉)
Yahoo! Research, 2-GA 2231, Santa Clara, CA 95051, USA
e-mail: anish@yahoo-inc.com
X.L. Dong
AT&T Labs - Research, 180 Park Ave., Florham Park, NJ 07932, USA
e-mail: lunadong@research.att.com
A.Y. Halevy
Google Inc., 1600 Amphitheatre Blvd, Mountain View, CA 94043, USA
e-mail: halevy@google.com
 