Databases Reference
In-Depth Information
each of the two tables. There are 10,000 jobs per day, with 200 days on record, for a
total of 2,000,000 rows in the Line_Item table. Since we will be doing table scans, we
will assume 64 KB prefetch buffers for the I/O read and write operations. The
Line_Item calculations are:
Average rows per prefetch buffer
= floor((65,536 bytes/block)/(100 bytes/row)) = 655.
Number of buffers
= ceiling(2,000,000 rows/(655 rows/buffer)) = 3,054.
The Job_Costing table contains 50 rows average per job, for 10,000 jobs per day,
with 200 days on record, for a total of 100,000,000 rows. The Job_Costing calculations
are:
Average rows per buffer
= floor((65,536 bytes/block)/(52 bytes/row)) = 1,260.
Number of buffers = ceiling(100,000,000 rows/(1,260 rows/buffer))
= 79,366.
Since we are joining the two tables in full, a merge-join is efficient. The data in the
Job_Costing table is ordered by job_id. The Line_Item table needs to be sorted by
job_id in order to do a merge-join. The Line_Item table is small enough to sort in
memory, so we only need one sequential scan to load the table into memory.
join cost (shared disk) = scan time (Line_Item and Job_Costing tables using 64 KB
prefetch buffers)
= (3,054 + 79,366 buffers)
×
5.8 ms
×
= 82,420
5.8 ms
= 478 seconds
8 minutes.
This is a bit long for a query, especially when you have to do several of them and
the 10 managers decide to run your excellent profit analysis program to obtain informa-
tion for their big meeting, which begins in five minutes. Soon there is a knocking at
your door! What can we do? We can calculate the necessary results during off hours, and
store the processed data in a materialized view. Then when the managers come in, they
can obtain results blazingly fast! Figure 5.2 shows the schema of the materialized view.
First, let's calculate the time to create the view, and then we'll compute the time to
obtain our results from the Profit_by_Job table. The view is created by running the
original query and then writing the results to the Profit_by_Job table. The number of
block (buffer) accesses for the join is the same, but the I/O time is now shorter because
the environment is dedicated during off hours.
Search WWH ::




Custom Search