Database Reference
In-Depth Information
resultset. This is data where the number of distinct items in the set of rows divided by the number of rows is a small
number (near zero). For example, a GENDER column might take on the values M , F , and NULL . If you have a table with
20,000 employee records in it, then you would find that 3/20000 = 0.00015. Likewise, 100,000 unique values out of
10,000,000 results in a ratio of 0.01—again, very small. These columns would be candidates for bitmap indexes.
They probably would not be candidates for a having B*Tree indexes, as each of the values would tend to retrieve an
extremely large percentage of the table. B*Tree indexes should be selective in general, as outlined earlier. Bitmap
indexes should not be selective—on the contrary, they should be very unselective in general.
Bitmap indexes are extremely useful in environments where you have lots of ad hoc queries, especially
queries that reference many columns in an ad hoc fashion or produce aggregations such as COUNT . For example,
suppose you have a large table with three columns: GENDER , LOCATION , and AGE_GROUP . In this table, GENDER has
a value of M or F , LOCATION can take on the values 1 through 50 , and AGE_GROUP is a code representing 18 and
under , 19-25 , 26-30 , 31-40 , and 41 and over . You have to support a large number of ad hoc queries that take the
following form:
select count(*)
from t
where gender = 'M'
and location in ( 1, 10, 30 )
and age_group = '41 and over';
select *
from t
where ( ( gender = 'M' and location = 20 )
or ( gender = 'F' and location = 22 ))
and age_group = '18 and under';
select count(*) from t where location in (11,20,30);
select count(*) from t where age_group = '41 and over' and gender = 'F';
You would find that a conventional B*Tree indexing scheme would fail you. If you wanted to use an index to get
the answer, you would need at least three and up to six combinations of possible B*Tree indexes to access the data
via the index. Since any of the three columns or any subset of the three columns may appear, you would need large
concatenated B*Tree indexes on the following:
GENDER , LOCATION , AGE_GROUP : For queries that used all three, or GENDER with LOCATION ,
or GENDER alone
LOCATION , AGE_GROUP : For queries that used LOCATION and AGE_GROUP or LOCATION alone
AGE_GROUP , GENDER : For queries that used AGE_GROUP with GENDER or AGE_GROUP alone
To reduce the amount of data being searched, other permutations might be reasonable as well in order to
decrease the size of the index structure being scanned. This is ignoring the fact that a B*Tree index on such low
cardinality data is not a good idea.
Here is where the bitmap index comes into play. With three small bitmap indexes, one on each of the
individual columns, you will be able to satisfy all of the previous predicates efficiently. Oracle will simply use
the functions AND, OR, and NOT, with the bitmaps of the three indexes together, to find the solution set for any
predicate that references any set of these three columns. It will take the resulting merged bitmap, convert the 1s
into rowids if necessary, and access the data (if you are just counting rows that match the criteria, Oracle will just
 
Search WWH ::




Custom Search