The sampling method shown earlier is a simple random sampling technique that has an
"equal probability of selection" (EPS) design.
EPS samples are considered useful because the variance of the sample attributes is similar
to the variance of the original data set. Bear in mind, though, that this property matters
only if you are interested in variances.
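As a reminder of what an equal-probability sample looks like in SQL, here is a minimal sketch, assuming a hypothetical table named mytable (this is one common way to draw such a sample, not necessarily the exact method shown earlier):

-- Each row is included with the same 1% probability,
-- so every row has an equal chance of selection (EPS).
SELECT * FROM mytable WHERE random() < 0.01;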
Simple random sampling can bias the eventual sample towards more frequently occurring
data. For example, if you take a 1% sample of a data set in which some kinds of data occur
only 0.001% of the time, you may end up with a sample that contains none of that outlying
data at all. To see why, consider a table of one million rows: data occurring 0.001% of the
time appears in only about ten rows, so a 1% random sample would be expected to pick up
just 0.1 of them.
What you might wish to do instead is pre-cluster your data and take different samples from
each group, to ensure that the sampled data set includes many more of the outlying
attributes (see the sketch after the following note). A simple method might be to:
- Include 1% of all normal data
- Include 25% of outlying data
Note that if you do this, then it is no longer an "EPS" sample design.
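As a minimal sketch of this stratified approach, assuming a hypothetical table measurements with a boolean column is_outlier that marks the pre-clustered outlying rows (both names are illustrative):

-- Keep 1% of normal rows and 25% of outlying rows;
-- random() decides inclusion independently for each row.
SELECT *
FROM measurements
WHERE (NOT is_outlier AND random() < 0.01)
   OR (is_outlier AND random() < 0.25);

Because the two strata are sampled at different rates, counts taken from such a sample must be re-weighted per stratum before being extrapolated back to the full table.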
See also
There are no doubt statisticians who will be in apoplexy after reading this. You're welcome to
use the facilities of the SQL language to create a more accurate sample. Please, just make
sure that you know what you're doing and/or check out some good statistical literature,
websites, or textbooks.
Loading data from a spreadsheet
Spreadsheets are the most obvious starting place for most data stores. Studies within a
range of businesses consistently show that more than 50% of smaller data stores are held in
spreadsheets or small desktop databases. Loading data from these sources is a frequent and
important task for many DBAs.
Getting ready
Spreadsheets combine data, presentation, and programs all in one file. That's perfect for power
users wanting to work quickly. Like other relational databases, PostgreSQL is mainly concerned
with the lowest level of data, so extracting just the data can present some challenges.
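One common way around this, sketched below with hypothetical names, is to export just the data from the spreadsheet as a CSV file and load it into a matching table; the table definition, file path, and the presence of a header row are all assumptions about the exported sheet:

-- Hypothetical target table matching the exported sheet's columns.
CREATE TABLE items (
    id    integer,
    name  text,
    price numeric
);
-- Server-side load of the exported CSV; the path is illustrative.
COPY items FROM '/path/to/items.csv' WITH (FORMAT csv, HEADER true);

From psql, the client-side \copy variant of the same command reads the file with the client's permissions rather than the server's, which is often more convenient.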
 