Databases Reference
In-Depth Information
Last but not least, we have looked at various ways of indexing, without any consideration as to
how indexes affect performance of operations other than as queries.
Surely, an index can improve query performance, reducing the number of I/O operations
needed to retrieve the data. However, index maintenance has a cost in DML operations—on
INSERTs and UPDATEs—when the database engine must update the index to reflect the
changes to the data in the table.
The index, furthermore, requests space to store itself. This aspect, however, is less relevant
due to inexpensive disks and memory on the market, but has to be considered when planning
to add an index on a table.
What is the "small percentage" of the data which assures we can
improve performances using B-tree indexes?
There isn't a rule of thumb to define what is the right percentage above which there is no
improvement in the query using the index. We can however estimate the amount of data that
has to be read to satisfy our inquiries.
Suppose that we have a table with an average row size of 200 bytes, and we create an index
on some fields, which can be used to execute our query. The database block size is 8 KB,
that is, 8192 bytes available. If we are using a PCTFREE of 20 for the table, we are allowed
to insert about 30/32 rows in a database block—we are not considering row headers—we
are interested in an estimated value, not a calculation of the real value. If the table contains
10000 rows, we are using more than 300 database blocks for the table.
Accessing the table using the index requires us to read the index block and access the data
in the table by the ROWID . If the data we are retrieving with the index is 30 percent of the
entire table (as per our example) we will access 3000 rows using an operation called TABLE
ACCESS BY ROWID , reading 3000 blocks of data (many blocks will be read more than once).
If we do a full table scan, in this situation, we will read all table data, which we have estimated
to be slightly more than 300 blocks. Using the index, in this situation, won't be a good choice.
If the data we are interested in is one percent of the entire table, we will access 100 database
blocks using the index, which is a considerable speed improvement when compared to the
full table scan. Finally, if the average row size is 2000 bytes, we will have about three rows
in a database block, so the table is made up of more than 3300 database blocks. In this
situation, even the first query which accessed 30 percent of the entire table, would benefit
from indexing.
As we have seen from this example, whether to use indexes depends upon various
parameters, and there isn't a magic number to use for every situation.
 
Search WWH ::




Custom Search