2.
Heterogeneity: The system can accommodate different hardware, network
protocols, data models, query languages, and query capabilities. The data
sources might be as similar as two versions of Oracle or SQL Server, or as
diverse as relational databases, websites serving XML, or special applications
with other types of databases.
3.
Autonomy: No restrictions are enforced at the remote data source, so it
remains autonomous. It is highly desirable that the federated database system
not change the local operation of an existing data source.
4.
High degree of function: A federated database system should allow applications
to exploit not only the high degree of function provided by the federated
system, but also the special functions unique to the individual data sources.
Typical federated systems use SQL as their query language, which makes them
easy to use relative to the individual local systems.
5.
Extensibility and openness: Federated systems need to evolve over time, and
thus need the flexibility to seamlessly add new data sources to the
enterprise. A wrapper module provides the data access logic for each data
source. It is common to supply wrappers for a set of known data sources such
as Oracle, Sybase, and XML files, plus some generic ones such as Open Database
Connectivity (ODBC). The IBM WebSphere Federation Server provides a wrapper
development kit so customers can write their own wrappers for proprietary data
sources that cannot be accessed by the native wrappers.
6.
Optimized performance: The query optimizer of a relational database system is
the component that determines the most efficient way to answer a given query.
In a federated system, the optimizer must also determine whether each
operation in a query (join, select, union, etc.) should be done by the
federated server or by the local system at each data source. To do this, the
optimizer needs a cost model for each data source and for the overall network.
It must also determine whether the query semantics remain identical when an
operation is pushed down to the data source (in a query rewrite) rather than
performed locally at the federated server. The latter decision is based on
information specific to the data source. Once an operation is identified as
remotely executable, the optimizer can use cost-based optimization to
determine whether pushing it down is the right decision.
None of these objectives explicitly depends on a data allocation strategy. In
federated systems, data allocation (or data distribution) is done largely at
the discretion of the database designer or database administrator. Sometimes
it involves homogeneous data sources, but usually the sources are
heterogeneous (DB2, Oracle, SQL Server, etc.). Once the allocation decision
has been made and the replicated data loaded into the system, the feder-