Database Tables - Expert Oracle Database Architecture

Database Reference

In-Depth Information

Index Clustered Tables Wrap-up

Clustered tables give you the ability to physically prejoin data together. You use clusters to store related data from

many tables on the same database block. Clusters can help read-intensive operations that always join data together or

access related sets of data (e.g., everyone in department 10 ).

Clustered tables reduce the number of blocks that Oracle must cache. Instead of keeping ten blocks for ten

employees in the same department, Oracle will put them in one block and therefore increase the efficiency of your

buffer cache. On the downside, unless you can calculate your SIZE parameter setting correctly, clusters may be

inefficient with their space utilization and can tend to slow down DML-heavy operations.

Hash Clustered Tables

Hash clustered tables are very similar in concept to the index clustered tables just described with one main exception:

the cluster key index is replaced with a hash function. The data in the table is the index; there is no physical index.

Oracle will take the key value for a row, hash it using either an internal function or one you supply, and use that to

figure out where the data should be on disk. One side effect of using a hashing algorithm to locate data, however, is

that you cannot range scan a table in a hash cluster without adding a conventional index to the table. In an index

cluster, the query

select * from emp where deptno between 10 and 20

would be able to make use of the cluster key index to find these rows. In a hash cluster, this query would result in a

full table scan unless you had an index on the DEPTNO column. Only exact equality searches (including IN lists and

subqueries) may be made on the hash key without using an index that supports range scans.

In a perfect world, with nicely distributed hash key values and a hash function that distributes them evenly over

all of the blocks allocated to the hash cluster, we can go straight from a query to the data with one I/O. In the real

world, we will end up with more hash key values hashing to the same database block address than fit on that block.

This will result in Oracle having to chain blocks together in a linked list to hold all of the rows that hash to this block.

Now, when we need to retrieve the rows that match our hash key, we might have to visit more than one block.

Like a hash table in a programming language, hash tables in the database have a fixed size. When you create the

table, you must determine the number of hash keys your table will have, forever. That does not limit the amount of

rows you can put in there.

In Figure 10-9 , we can see a graphical representation of a hash cluster with table EMP created in it. When the client

issues a query that uses the hash cluster key in the predicate, Oracle will apply the hash function to determine which

block the data should be in. It will then read that one block to find the data. If there have been many collisions, or

the SIZE parameter to the CREATE CLUSTER was underestimated, Oracle will have allocated overflow blocks that are

chained off the original block.

Search WWH ::

Custom Search

Home