Database Reference
In-Depth Information
Example 9-9. Disable sequential scan, coerce index use
set enable_seqscan = false ;
EXPLAIN ( ANALYZE )
SELECT *
FROM census . lu_fact_types
WHERE fact_subcats && '{White alone, Black alone}' :: varchar [];
Bitmap Heap Scan on lu_fact_types (cost=12.02..14.04 rows=2 width=200) (actual
time=0.058..0.058 rows=2 loops=1) Recheck Cond: (fact_subcats && '{"White
alone","Black alone"}'::character varying[]) Heap Blocks: exact=1 -> Bitmap Index
Scan on idx_lu_fact_types
(cost=0.00..12.02 rows=2 width=0) (actual time=0.048..0.048 rows=2 loops=1)
Index Cond: (fact_subcats && '{"White alone","Black alone"}'::character
varying[])
Planning time: 0.230 ms
Execution time: 0.119 ms
From this plan, we learn that our index can be used, but because the estimated cost is
more than doing a sequential scan, the planner under normal circumstances will opt
for the sequential scan. The planner was right in its assessment because our index exe‐
cution time turns out to be a little more than a sequential scan. As we add more data to
our table, we'll probably find that the planner changes strategies to an index scan.
In contrast to the previous example, suppose we were to write a query of the form:
SELECT * FROM census . lu_fact_types WHERE 'White alone' = ANY ( fact_subcats );
We would discover that, regardless of how we set enable_seqscan , the planner will
always perform a sequential scan because the index we have in place can't service this
query. So it is important to consider which indexes will be useful and to write queries
to take advantage of them. And experiment, experiment, experiment!
Table Statistics
Despite what you might think or hope, the query planner is not a magician. Its decisions
follow prescribed logic that's far beyond the scope of this topic. The rules that the planner
follows depend heavily on the current state of the data. The planner can't possibly scan
all the tables and rows prior to formulating its plan. That would be self-defeating. In‐
stead, it relies on aggregated statistics about the data.
Having accurate and current stats is crucial for the planner to make the right decision.
If stats differ greatly from reality, the planner will often come up with bad plans, the
most detrimental of these being unnecessary sequential table scans. Generally, only
about 20 percent of the entire table is sampled to produce stats. This percentage could
be even lower for very large tables. You can control the number of rows sampled on a
column-by-column basis by setting the STATISTICS value.
Search WWH ::




Custom Search