BACKGROUND
Database Integration Process
Database integration, involving both the schemas and the instances of databases, should be performed in
database migration/consolidation, data warehouse, and multidatabase systems. Regardless of the mode of
integration, the basic database integration tasks are essentially the same. We view the entire database
integration as a set of processes that derive the integrated schema and instances, which can then be
implemented on either multidatabase or data warehouse systems.
The logical steps by which the integrated database is derived from the existing (local) databases do not
dictate exactly how and when those steps should be performed. For example, for the actual consolidation
of databases, schema integration and instance integration should be performed together. However, if only
a virtual integration is required, schema integration is performed once, whereas instance integration
is performed whenever queries are evaluated against the integrated database. The actual schema and
instance integration techniques adopted depend on a number of factors, such as the global applications'
requirements, the types of conflicts found among the local databases, and the data quality of the local
databases.
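The difference in timing between actual consolidation and virtual integration can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two local databases, the hand-written attribute mappings standing in for the output of schema integration, and the names `integrate_instances`, `consolidate`, and `VirtualDatabase` are hypothetical and do not come from the chapter.

```python
# Hypothetical local databases: each is a list of rows with its own column names.
LOCAL_DBS = {
    "db1": [{"cust_id": 1, "cust_name": "Ann"}],
    "db2": [{"id": 2, "name": "Bob"}],
}

# Assumed result of schema integration, written by hand here:
# global attribute -> local attribute, per local database.
MAPPINGS = {
    "db1": {"id": "cust_id", "name": "cust_name"},
    "db2": {"id": "id", "name": "name"},
}


def integrate_instances(local_dbs, mappings):
    """Translate every local row into the global schema."""
    rows = []
    for db_name, local_rows in local_dbs.items():
        mapping = mappings[db_name]
        for row in local_rows:
            rows.append({g: row[l] for g, l in mapping.items()})
    return rows


def consolidate(local_dbs, mappings):
    """Actual consolidation: instances are integrated once and stored."""
    return integrate_instances(local_dbs, mappings)


class VirtualDatabase:
    """Virtual integration: mappings are fixed once, instances are
    integrated again every time a query is evaluated."""

    def __init__(self, local_dbs, mappings):
        self.local_dbs = local_dbs
        self.mappings = mappings

    def query(self, predicate):
        current = integrate_instances(self.local_dbs, self.mappings)
        return [row for row in current if predicate(row)]
```

In this sketch a consolidated copy does not see later changes to the local databases, while a `VirtualDatabase` query does, because it repeats instance integration at query time.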
Schema Integration Process
Each local database consists of a schema and a set of data instances. The schema integration process
requires knowledge about the local database schemas. Some of this knowledge can be discovered from the
database content. For example, database reverse engineering extracts an application's domain knowledge
by analyzing not only the database schema but also the database instances of an existing database
(Chiang, Barron and Storey, 1994). However, we always require the database designers or administrators
to supply additional knowledge manually. Schema integration produces the global schema as well as the
mappings between the global schema elements and the local schema elements. Very often, a local schema
is vastly different from the global schema. This can be caused by the different data models or database
design decisions adopted by the local databases and the integrated database. We may therefore have to
introduce a view of the local schema, called an export schema, such that the local database, seen through
the export schema, is compatible with the global schema. An export schema also defines the portion or
subset of a local database to be integrated. The conversion from local database to export database
usually involves schema transformation. Efforts in this area are reported in (Fahrner and Vossen, 1995;
Meier et al., 1994; Zaniolo, 1979).
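As one possible illustration of an export schema, the sketch below defines a relational view over a local table using Python's standard `sqlite3` module. The table `emp`, the view `export_employee`, the column names, and the cents-to-currency conversion are all hypothetical; the chapter does not prescribe SQL views as the mechanism, and this is only one way such a view could be realized when both schemas are relational.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical local schema with its own naming and units.
    CREATE TABLE emp (eno INTEGER, ename TEXT, sal_cents INTEGER, notes TEXT);
    INSERT INTO emp VALUES (1, 'Ann', 500000, 'internal');
    INSERT INTO emp VALUES (2, 'Bob', 420000, 'internal');

    -- Export schema: a view that exposes only the portion to be integrated,
    -- renamed and converted to match the (assumed) global schema.
    CREATE VIEW export_employee AS
        SELECT eno AS employee_id,
               ename AS name,
               sal_cents / 100.0 AS salary
        FROM emp;
    """
)

# The global level reads the local database only through the export schema.
rows = conn.execute(
    "SELECT employee_id, name, salary FROM export_employee ORDER BY employee_id"
).fetchall()
```

The `notes` column is deliberately left out of the view, reflecting the point that an export schema also delimits which subset of the local database takes part in the integration.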
Instance Integration Process
During the instance integration process, entity identification always precedes attribute value conflict
resolution, since only the conflicting attribute values of matching data instances should be resolved.
Throughout the entire instance integration, any detected erroneous integration result (e.g., two data
instances from the same existing database are matched to one single data instance from another database)
is forwarded to the schema integration process as feedback if the error is possibly caused by incorrect
schema integration. This can happen when the schema integration makes use of hypotheses obtained by
sampling the local databases, because such hypotheses may not hold for all local database instances.
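The ordering described above (entity identification first, then a check for erroneous matches, then attribute value conflict resolution) can be sketched as follows. This is a simplified illustration under stated assumptions: matching is done by equality on a single hypothetical key attribute, conflicts are resolved by always preferring the first database's value, and the function names and sample rows are invented for the example.

```python
from collections import Counter


def identify_entities(db_a, db_b, key):
    """Entity identification: index pairs (i, j) of rows that agree on `key`."""
    return [
        (i, j)
        for i, a in enumerate(db_a)
        for j, b in enumerate(db_b)
        if a[key] == b[key]
    ]


def suspicious_targets(matches):
    """Indices of db_b rows matched by more than one db_a row.
    Such results are candidates for feedback to schema integration."""
    counts = Counter(j for _, j in matches)
    return sorted(j for j, n in counts.items() if n > 1)


def resolve_conflicts(db_a, db_b, matches):
    """Attribute value conflict resolution on matched rows only.
    Assumed policy: on conflict, the value from db_a wins."""
    merged = []
    for i, j in matches:
        row = dict(db_b[j])
        row.update(db_a[i])
        merged.append(row)
    return merged


# Hypothetical sample data: "name" was assumed to identify entities.
db_a = [
    {"name": "Ann", "city": "Oslo"},
    {"name": "Ann", "city": "Rome"},
    {"name": "Bob", "city": "Paris"},
]
db_b = [
    {"name": "Ann", "phone": "111"},
    {"name": "Bob", "city": "Lyon", "phone": "222"},
]

matches = identify_entities(db_a, db_b, "name")
flagged = suspicious_targets(matches)  # feedback for schema integration
clean = [(i, j) for i, j in matches if j not in flagged]
merged = resolve_conflicts(db_a, db_b, clean)
```

Here two rows of `db_a` match the same row of `db_b`, which suggests that the assumed key does not hold for all instances; only the unambiguous match proceeds to conflict resolution.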