Database Reference
In-Depth Information
To get a sense of what the planner uses, query the pg_stats table, as illustrated in
Example 9-10 :
SELECT
attname As colname ,
n_distinct ,
most_common_vals AS common_vals ,
most_common_freqs As dist_freq
FROM pg_stats
WHERE tablename = 'facts'
ORDER BY schemaname , tablename , attname ;
Example 9-10. Data distribution histogram
colname | n_distinct | common_vals | dist_freq
-------------+------------+------------------+-----------------
fact_type_id | 68 | {135,113... | {0.0157,0.0156333,...
perc | 985 | {0.00,... | {0.1845,0.0579333,0.056...
tract_id | 1478 | {25025090300... | {0.00116667,0.00106667,0.0...
val | 3391 | {0.000,1.000,2...| {0.2116,0.0681333,0...
yr | 2 | {2011,2010} | {0.748933,0.251067}
pg_stats gives the planner a sense of how actual values are dispersed within a given
column and lets it plan accordingly. The pg_stats table is constantly updated as a
background process. After a large data load or a major deletion, you should manually
update the stats by executing VACUUM ANALYZE . VACUUM permanently removes deleted
rows from tables; ANALYZE updates the stats.
For columns that participate often in joins and are used heavily in WHERE clauses, you
should consider increasing sampled rows.
ALTER TABLE census . facts ALTER COLUMN fact_type_id SET STATISTICS 1000 ;
Random Page Cost and Quality of Drives
Another setting that influences the planner is the random_page_cost (RPC) ratio, which
is the relative cost of the disk in retrieving a record using a sequential read versus using
random access. Generally, the faster (and more expensive) the physical disk, the lower
the ratio. The default value for RPC is 4, which works well for most mechanical hard
drives on the market today. The use of solid-state drives (SSDs), high-end storage area
networks (SANs), or cloud storage makes it worth tweaking this value.
You can set the RPC ratio per database, per server, or per tablespace. At the server level,
it makes most sense to set the ratio in the postgresql.conf file. If you have different kinds
of disks, you can set the values at the tablespace level using the ALTER TABLESPACE
command:
ALTER TABLESPACE pg_default SET ( random_page_cost = 2 );
Search WWH ::




Custom Search