Database Reference
In-Depth Information
Manager Jones wants to know the names of employees on his projects who make
more than $100,000.
SELECT E.EmpName
FROM PROJECT P, EMPLOYEE E, ASSIGNMENT A
WHERE P.ProjNo = A.ProjNo
AND E.EmpNo = A.EmpNo
AND P.ProjManager = 'Jones'
AND E.Salary > 100000
This is not a very complicated query. As you know, in terms of relational algebra,
the result is obtained by two select operations, two union operations, and one project
operation. In a centralized database system, all of these operations are performed
on the database at a single site. That is not the case with a distributed database
system. There are numerous ways that data can be moved around to perform these
operations in the most optimal manner. The DDBMS is expected to examine these
ways and choose the least expensive option.
Even in our simple distributed database system with just three relations and three
sites, several options are conceivable. Just to give you an idea of the complexity of
join query processing, let us list a few options. After the result is obtained, the result
is transmitted to the site where manager Jones is. Review the following list:
Perform select operation at site B on EMPLOYEE for salary > 100000. Move
result to site A. Complete all remaining operations at site A.
Move entire EMPLOYEE relation to site A. Perform all operations at site A.
Perform join operation on PROJECT and ASSIGNMENT relations at site
A. Perform select operation on the result for 'Jones.' For each of the rows
in this selection, send a message to B to check whether the corresponding
EMPLOYEE row shows salary > 100000. Get responses back from B. Select
only such EMPLOYEE rows. Perform project operation and obtain final result.
Move PROJECT and ASSIGNMENT relations to site B. Perform all opera-
tions at site B.
Perform select operation at site A on PROJECT relation for 'Jones.' Perform
join operation on this result and ASSIGNMENT relation at site A. Perform
select operation at site B on EMPLOYEE relation for salary > 100000. For
each of the selected rows send a message to site A to verify whether the selected
row relates to 'Jones.' Select only such rows. Perform project operation and
obtain final result.
Perform join operation on PROJECT and ASSIGNMENT relations at site A.
Perform select operation on result for 'Jones.' Move result to site B. Complete
remaining operations at site B.
The total data transmission time for each of these options can be computed based
on the number of rows in each relation, the number of qualifying rows in the
EMPLOYEE and PROJECT relations, the size of the records, and data transfer
rates between the sites.
Search WWH ::




Custom Search