Database Reference
Query Optimization
Query optimization must itself be distributed in order to minimize network
traffic. To illustrate, consider a query Qa issued at site A that performs a natural
join of two relations: Rb at site B and Rc at site C. The optimizer must decide on one of
the following strategies:
a. Move copies of Rb and Rc to site A and process the join there
b. Move a copy of Rb to site C and process the join there
c. Move a copy of Rc to site B and process the join there
The optimizer must be able to calculate which alternative would be most economical
(given the structure and configuration of the underlying network) and choose that
alternative. For example, Oracle implements two query optimization strategies:
rule-based optimization and cost-based optimization. Before a query is executed, the
optimizer converts it to an internal format (based on an Oracle algorithm) that is
intended to give the most efficient execution.
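The choice among strategies a to c can be sketched as a simple transfer-cost comparison. This is only an illustration, not how any particular optimizer works: it assumes a uniform cost per byte on every link, assumes that in strategies b and c the join result must still be shipped back to site A, and uses hypothetical names (`Relation`, `cheapest_join_strategy`, `result_bytes`) that do not come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    name: str
    site: str
    size_bytes: int


def cheapest_join_strategy(rb, rc, result_bytes, cost_per_byte=1.0):
    """Return (label, cost) for the cheapest of strategies a, b and c.

    Assumption: every link costs the same per byte, and the query site
    needs the join result, so b and c also pay to ship the result back.
    """
    costs = {
        # a. ship both relations to the query site and join there
        "a": (rb.size_bytes + rc.size_bytes) * cost_per_byte,
        # b. ship Rb to Rc's site, join there, ship the result back
        "b": (rb.size_bytes + result_bytes) * cost_per_byte,
        # c. ship Rc to Rb's site, join there, ship the result back
        "c": (rc.size_bytes + result_bytes) * cost_per_byte,
    }
    best = min(costs, key=costs.get)
    return best, costs[best]


rb = Relation("Rb", "B", 1_000_000)
rc = Relation("Rc", "C", 50_000)
strategy, cost = cheapest_join_strategy(rb, rc, result_bytes=10_000)
print(strategy, cost)
```

With a large Rb and a small Rc, moving the small relation to the large one (strategy c) wins under these assumptions. A real optimizer would also weigh link speeds, local processing cost and the accuracy of its size estimates.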
Catalog Management
Catalog management is one of the most complex issues that a distributed database must
resolve, because additional information must be stored for each database object
(e.g. fragmentation, replication, location, etc.). Where and how the catalog should be
stored is a complicated question. Below are some alternatives:
Centralized: The catalog is stored at a central location and is accessible to the other participating sites.
Fully Replicated: The catalog is replicated at each participating site.
Partitioned: Each site maintains its own catalog. The total catalog is the union of all the site catalogs.
Hybrid: Each site maintains its own catalog; additionally, a central site maintains the global catalog.
Each of these approaches has its own advantages and challenges. Choosing among them
often involves simulation software and considerable research.
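To make the partitioned and hybrid alternatives concrete, the following is a minimal sketch. It assumes each site catalog is simply a set of object names, and the function names (`build_global_catalog`, `locate`) are hypothetical. A real catalog would also record fragmentation and replication details.

```python
def build_global_catalog(site_catalogs):
    """Union of the per-site catalogs: object name -> set of sites holding it."""
    merged = {}
    for site, entries in site_catalogs.items():
        for obj_name in entries:
            merged.setdefault(obj_name, set()).add(site)
    return merged


def locate(obj_name, local_site, site_catalogs, global_catalog=None):
    """Return one site that stores obj_name, or None if no catalog lists it."""
    # Local hit: no remote catalog access is needed.
    if obj_name in site_catalogs[local_site]:
        return local_site
    # Hybrid: a single lookup in the global catalog kept at the central site.
    if global_catalog is not None:
        sites = global_catalog.get(obj_name)
        return min(sites) if sites else None
    # Partitioned: consult the other sites' catalogs one by one.
    for site in sorted(site_catalogs):
        if site != local_site and obj_name in site_catalogs[site]:
            return site
    return None


catalogs = {"A": {"Qa_view"}, "B": {"Rb"}, "C": {"Rc"}}
central = build_global_catalog(catalogs)
print(locate("Rc", "A", catalogs))           # partitioned lookup
print(locate("Rc", "A", catalogs, central))  # hybrid lookup
```

The sketch shows the trade-off in miniature: the partitioned lookup may touch every remote site, while the hybrid lookup needs one remote access but depends on the central site being reachable and current.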
Update Propagation
Where data is replicated at different sites, it may not be possible to apply an
update to all replicas at the desired time. How is this resolved? The primary copy
approach is a common method of resolution:
One replica is deemed the primary copy. As soon as that copy is updated, the update process is deemed complete.
The site holding the primary copy is responsible for updating the other sites as soon as possible.
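The primary copy approach can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not a real replication protocol: the class `ReplicatedItem`, its `pending` queue and the `reachable` argument are hypothetical, and failures of the primary site itself are ignored.

```python
from collections import deque


class ReplicatedItem:
    """One data item replicated at several sites, with a designated primary copy."""

    def __init__(self, primary_site, sites, initial=None):
        if primary_site not in sites:
            raise ValueError("primary site must be one of the replica sites")
        self.primary_site = primary_site
        self.values = {site: initial for site in sites}
        self.pending = deque()  # (site, value) updates not yet applied

    def update(self, value):
        """Update the primary copy; the update counts as complete at this point."""
        self.values[self.primary_site] = value
        for site in self.values:
            if site != self.primary_site:
                self.pending.append((site, value))

    def propagate(self, reachable):
        """Apply queued updates to reachable sites; keep the rest queued in order."""
        still_pending = deque()
        while self.pending:
            site, value = self.pending.popleft()
            if site in reachable:
                self.values[site] = value
            else:
                still_pending.append((site, value))
        self.pending = still_pending


item = ReplicatedItem("A", ["A", "B", "C"], initial=0)
item.update(1)            # complete as soon as the primary copy at A changes
item.propagate({"B"})     # C is unreachable for now, so its update stays queued
print(item.values, list(item.pending))
```

Between `update` and a successful `propagate`, the non-primary replicas hold stale values. That temporary inconsistency is the price paid for not waiting on every site.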
 