Databases Reference
In-Depth Information
Unpredictable queries also constitute a large part of the computation issues.
Unpredictable queries (as used in exploratory analysis) are, by definition, those where
you do not know in advance what the user may want to find out. These impose a number
of problems, including:
You do not know whether the answer to a query will require
aggregated data or access to transaction-level data. If the answer
can be satisfied by aggregated data then multi-dimensional cubes,
materialized views, and pre-defined time-series type of data
preparation may be appropriate.
Even where a query may be satisfied through use of pre-
aggregated data, the nature of unpredictable queries is such that
you cannot always guarantee that the correct aggregations are
available. If they are not then the materialized views and cubes
will need to be regenerated or in many cases if the queries are
looking for attributes that are not in the materialized views and
cubes, then you will have to revert to creation of new
pre-aggregation mechanisms.
The other problem with unpredictable queries is that appropriate
indexes may not be defined. While the database optimizer can
re-write badly constructed SQL, and determine the most efficient
joins and optimize the query path in general, it cannot do
anything for lack of indexes. In practice, if a column is not indexed
at all, then this will usually mean that the query has to perform
a full table scan, and if this is a large table then there will be a
substantial performance hit as a result.
One possible option is to build indexes on every conceivable column. Unfortunately
this is not usually practical. Every index you build will help to improve the performance
of queries that use that index, but at the same time every index you add to the database
increases the size of the database.
Note
Certain types of queries require that the whole of a table must be scanned. Some
of these arise when there are no available indexes, or from the sorts of complex queries
described above. However, very much simpler queries can also give rise to full table
scans. For example:
“List the full name and email address for customers born in July:”
Given that one in 12 customers are born in July, a typical database
optimizer will not consider it worthwhile to use an index, and it
will conduct a full table scan. If you have 10 million customers for
each of whom you store 3,200 bytes, for instance, then this will
mean reading a total of 32,000,000,000 bytes.
 
Search WWH ::




Custom Search