Database Reference
In-Depth Information
Dept(deptno, dname)
Emp_Dept_Cluster
Emp(ename, deptno)
20
KING
10
10 ACCOUNTING
10
RESEARCH
CLARK
MILLER
10
10
20
20
20
20
20 RESEARCH
ACCOUNTING
SMITH
JONES
SCOTT
ADAMS
KING
CLARK
MILLER
30 SALES
40 OPERATIONS
SMITH
JONES
SCOTT
ADAMS
Figure 10-8. Index clustered data
So, let's look at how we might go about creating a clustered object. Creating a cluster of tables in the object is
straightforward. The definition of the storage of the object ( PCTFREE , PCTUSED , INITIAL , and so on) is associated with
the CLUSTER , not the tables. This makes sense since there will be many tables in the cluster, and they will be on the
same block. Having different PCTFREE s would not make sense. Therefore, a CREATE CLUSTER statement looks a lot like
a CREATE TABLE statement with a small number of columns (just the cluster key columns):
EODA@ORA12CR1> create cluster emp_dept_cluster
2 ( deptno number(2) )
3 size 1024
4 /
Cluster created.
Here, we have created an index cluster (the other type being a hash cluster , which we'll look at in a coming section
“Hash Clustered Tables”). The clustering column for this cluster will be the DEPTNO column. The columns in the tables
do not have to be called DEPTNO but they must be NUMBER (2) to match this definition. We have, on the cluster definition,
a SIZE 1024 option. This is used to tell Oracle that we expect about 1,024 bytes of data to be associated with each
cluster key value. Oracle will use that to compute the maximum number of cluster keys that could fit per block.
Given that we have an 8KB block size, Oracle will fit up to seven cluster keys (but maybe less if the data is larger than
expected) per database block. For example, the data for departments 10 , 20 , 30 , 40 , 50 , 60 , and 70 would tend to go
onto one block, and as soon as we insert department 80 , a new block will be used. This does not mean that the data is
stored in a sorted manner; it just means that if we inserted the departments in that order, they would naturally tend to
be put together. If we inserted the departments in the order 10 , 80 , 20 , 30 , 40 , 50 , 60 , and then 70 , the final department
( 70) would tend to be on the newly added block. As we'll see next, both the size of the data and the order in which the
data is inserted will affect the number of keys we can store per block.
The SIZE parameter therefore controls the maximum number of cluster keys per block. It is the single largest
influence on the space utilization of our cluster. Set the size too high, and we'll get very few keys per block and we'll
use more space than we need. Set the size too low, and we'll get excessive chaining of data, which offsets the purpose
of the cluster to store all of the data together on a single block. It is the most important parameter for a cluster.
Next, we need to index the cluster before we can put data in it. We could create tables in the cluster right now,
but we're going to create and populate the tables simultaneously, and we need a cluster index before we can have any
data. The cluster index's job is to take a cluster key value and return the block address of the block that contains that
 
Search WWH ::




Custom Search