Overview of practical aspects in physical database design.
to additionally consider updates, we can include queries that modify
the corresponding tables and therefore can analyze a specific problem
instance in detail.
Synthetically generated queries: Before deploying a new application,
developers are usually aware of the set of queries that would be exe-
cuted in the production environment via stored procedures. However,
database administrators (DBAs) might not know in advance additional
details about the query mix, or the specific values for which stored pro-
cedures would be instantiated and executed. In that case, a common
approach is to synthetically generate query workloads that follow ex-
pected distributions and common parameter values. These workloads
can be useful to tune a system before it goes into production.
Automatically profiled queries: The best approach to obtain represen-
tative workloads is to actually monitor and capture queries as they are
processed by the database system. This approach results in the most
representative workloads but is usually associated with larger overheads.
The reason is that each query that is executed needs to be additionally
logged by a query profiler. There are, however, some mechanisms to
mitigate this overhead. First, we can profile queries for short periods
of time, effectively sampling the workload. Alternatively, some sophisti-
cated profiling tools can selectively capture queries that satisfy certain
predicates, such as those issued by a given user or defined over a spe-
cific set of tables. Finally, we can piggyback on top of existing database
functionality and obtain relatively good approximations of recent queries
being executed. An example of this approach leverages the plan cache
(which stores frequently used plans and therefore avoids reoptimizing
queries). Since the plan cache contains queries that are either expen-
sive or frequently executed, periodically flushing the content of the plan
cache into a local repository can result in representative workloads that
are obtained without incurring significant overhead.