Tables & Data - PostgreSQL 9 Administration

Database Reference

In-Depth Information

We would then need to use our judgment to decide whether one or both of those columns are

unique by chance, or as part of the design of the database that created them.

It's possible that there is no single column that uniquely identifies the rows. Multiple column

keys are fairly common. If none of the columns were unique, then we would start to look for

unique keys that are combinations of the most unique columns. The following query shows

a frequency distribution for the table: a value occurs twice in one case, and another value

occurs only once.

postgres=# SELECT count as num_of_values, count(*)

FROM (SELECT customerid, count(*)

FROM ord

GROUP BY customerid) s

GROUP BY count

ORDER BY count(*);

num_of_values | count

---------------+-------

2 | 1

1 | 1

(2 rows)

and we can change the query to include multiple columns, like the following:

SELECT count as num_of_values, count(*)

FROM (SELECT column1, column2.... columnN

,count(*)

FROM ord

GROUP BY column1, column2.... columnN

) s

GROUP BY count

ORDER BY count(*);

This query will result in just one row, once we find a set of columns that is unique. As we

get closer to finding the key, we will see that the distribution gets tighter and tighter.

How it works...

Finding a unique key is possible for a program, though in most cases, a human can do this

much faster by looking at things like column names, foreign keys, or business understanding

to reduce the number of searches required by the brute-force approach.

ANALYZE works by taking a sample of the table data, and then performing a statistical

analysis of the results. The n_distinct value is the estimate of the "number of distinct

values" for the column.

Search WWH ::

Custom Search

Home