Database Reference
In-Depth Information
To get a sense of what the planner uses, query the
pg_stats
table, as illustrated in
Example 9-10
:
SELECT
attname
As
colname
,
n_distinct
,
most_common_vals
AS
common_vals
,
most_common_freqs
As
dist_freq
FROM
pg_stats
WHERE
tablename
=
'facts'
ORDER
BY
schemaname
,
tablename
,
attname
;
Example 9-10. Data distribution histogram
colname | n_distinct | common_vals | dist_freq
-------------+------------+------------------+-----------------
fact_type_id | 68 | {135,113... | {0.0157,0.0156333,...
perc | 985 | {0.00,... | {0.1845,0.0579333,0.056...
tract_id | 1478 | {25025090300... | {0.00116667,0.00106667,0.0...
val | 3391 | {0.000,1.000,2...| {0.2116,0.0681333,0...
yr | 2 | {2011,2010} | {0.748933,0.251067}
pg_stats
gives the planner a sense of how actual values are dispersed within a given
column and lets it plan accordingly. The
pg_stats
table is constantly updated as a
background process. After a large data load or a major deletion, you should manually
update the stats by executing
VACUUM ANALYZE
.
VACUUM
permanently removes deleted
rows from tables;
ANALYZE
updates the stats.
For columns that participate often in joins and are used heavily in
WHERE
clauses, you
should consider increasing sampled rows.
ALTER
TABLE
census
.
facts
ALTER
COLUMN
fact_type_id
SET
STATISTICS
1000
;
Random Page Cost and Quality of Drives
Another setting that influences the planner is the
random_page_cost
(RPC) ratio, which
is the relative cost of the disk in retrieving a record using a sequential read versus using
random access. Generally, the faster (and more expensive) the physical disk, the lower
the ratio. The default value for RPC is 4, which works well for most mechanical hard
drives on the market today. The use of solid-state drives (SSDs), high-end storage area
networks (SANs), or cloud storage makes it worth tweaking this value.
You can set the RPC ratio per database, per server, or per tablespace. At the server level,
it makes most sense to set the ratio in the
postgresql.conf
file. If you have different kinds
command:
ALTER
TABLESPACE
pg_default
SET
(
random_page_cost
=
2
);