Database Reference
In-Depth Information
Index Clustered Tables Wrap-up
Clustered tables give you the ability to physically prejoin data together. You use clusters to store related data from
many tables on the same database block. Clusters can help read-intensive operations that always join data together or
access related sets of data (e.g., everyone in department 10 ).
Clustered tables reduce the number of blocks that Oracle must cache. Instead of keeping ten blocks for ten
employees in the same department, Oracle will put them in one block and therefore increase the efficiency of your
buffer cache. On the downside, unless you can calculate your SIZE parameter setting correctly, clusters may be
inefficient with their space utilization and can tend to slow down DML-heavy operations.
Hash Clustered Tables
Hash clustered tables are very similar in concept to the index clustered tables just described with one main exception:
the cluster key index is replaced with a hash function. The data in the table is the index; there is no physical index.
Oracle will take the key value for a row, hash it using either an internal function or one you supply, and use that to
figure out where the data should be on disk. One side effect of using a hashing algorithm to locate data, however, is
that you cannot range scan a table in a hash cluster without adding a conventional index to the table. In an index
cluster, the query
select * from emp where deptno between 10 and 20
would be able to make use of the cluster key index to find these rows. In a hash cluster, this query would result in a
full table scan unless you had an index on the DEPTNO column. Only exact equality searches (including IN lists and
subqueries) may be made on the hash key without using an index that supports range scans.
In a perfect world, with nicely distributed hash key values and a hash function that distributes them evenly over
all of the blocks allocated to the hash cluster, we can go straight from a query to the data with one I/O. In the real
world, we will end up with more hash key values hashing to the same database block address than fit on that block.
This will result in Oracle having to chain blocks together in a linked list to hold all of the rows that hash to this block.
Now, when we need to retrieve the rows that match our hash key, we might have to visit more than one block.
Like a hash table in a programming language, hash tables in the database have a fixed size. When you create the
table, you must determine the number of hash keys your table will have, forever. That does not limit the amount of
rows you can put in there.
In Figure 10-9 , we can see a graphical representation of a hash cluster with table EMP created in it. When the client
issues a query that uses the hash cluster key in the predicate, Oracle will apply the hash function to determine which
block the data should be in. It will then read that one block to find the data. If there have been many collisions, or
the SIZE parameter to the CREATE CLUSTER was underestimated, Oracle will have allocated overflow blocks that are
chained off the original block.
 
Search WWH ::




Custom Search