Database Reference
In-Depth Information
Clustering for efficiency
Mostusersstopoptimizingtheperformanceofatableafteraddingtheappropriatein-
dexes.Thisusuallyhappensbecausetheperformancebecomes"goodenough".But
what if the table has millions or billions of records? This amount of information may
not fit in the database server's RAM, thereby forcing hard drive access. Generally,
table records are stored sequentially on the hard drive. But, the data being fetched
from the hard drive for a query may be accessing many different parts of the hard
drive.Havingtoaccessdifferentpartsofaharddriveisaknownperformancelimita-
tion.
Tomitigateharddriveperformanceissues,adatabasetablecanhaveitsrecordsre-
orderedontheharddrivesothatsimilarrecorddataarestorednexttoorneareach
other.Thereorderingofadatabasetableisknownas clustering andisusedwiththe
CLUSTER statement in PostgreSQL.
Getting ready
WewillusetheCaliforniaschools( caschools )andSanFranciscoboundaries( sf-
poly )tablesforthisrecipe.Ifneithertableisavailable,refertothefirstrecipeofthis
chapter.
The psql utilitywill be used for this recipe's queries as follows:
> psql -d chapter10
chapter10=# \timing on
How to do it...
Use the followingsteps to cluster a table:
1. Beforeusingthe CLUSTER statement,checkthetimeatwhichthequeryused
in the previous recipe was executed by executing the following commands:
SELECT
schoolid
FROM caschools sc
Search WWH ::




Custom Search