Databases Reference
In-Depth Information
How it works...
The hash clusters use a hash function to identify the database block in which the rows with a
particular cluster key value are stored, replacing the index functionalities of index clusters.
In this recipe, we have created the EMP_DEPT_CLUSTER cluster to store information related
to DEPARTMENTS and EMPLOYEES , stored together using DEPARTMENT_ID as the cluster key.
We then created and populated the CL_DEPARTMENTS and CL_EMPLOYEES tables with the
data of the DEPARTMENTS and EMPLOYEES tables of the HR schema.
In step 5 and step 6, we analyzed a simple query against the EMPLOYEES table and its
hash-clustered counterpart CL_EMPLOYEES , to verify that. In the first query, we are using
the EMP_DEPARTMENT_IX index to access the rows corresponding to the department
identified by the ID 20. In the second query, we access only the clustered table segment
with a TABLE ACCESS HASH operation.
The performance gain is clear we are not accessing two segments—the index and the table — to
retrieve the queried data, but only the table segment.
There's more...
In hash-clustered tables, we have a greater saving in size, because we do not store any data
outside the rows in the clustered tables—no indexes—and the cluster key values won't be
duplicated as with index clusters.
The con is that we must know in advance the maximum cardinality of the tables, because in
the cluster creating phase, we need to declare the maximum number of different keys with
the HASHKEYS parameters. If the table grows over this value, we have to rebuild the cluster.
Someone could think of declaring a big HASHKEYS value, but this isn't a good idea. The space
required to store HASHKEYS rows in the cluster will be allocated when the cluster is created. If
we have oversized the hash cluster, there is a huge waste of space.
Hash clusters are useful when we know in advance—or can predict—the number of key values,
and when we perform regular queries on clustered tables using the equality predicate on the
cluster key.
Even for hash clusters, if we perform frequent full table scans, and we have to allocate a lot
of space for our growing table, it's better to use regular (heap) tables.
Sorted hash clusters
Beginning with Oracle Database 10 g , hash clusters can be sorted. We can decide the order
in which the rows are sorted—within the same cluster key value—even on multiple fields.
 
Search WWH ::




Custom Search