to throw away information and also reduce the power of any tests and the
precision of any estimates.
DISTRIBUTION, CUMULATIVE DISTRIBUTION,
EMPIRICAL DISTRIBUTION, LIMITING DISTRIBUTION
Suppose we were able to examine all the items in a population and record
a value for each one to obtain a distribution of values. The cumulative
distribution function of the population, F[x], denotes the probability that an
item selected at random from this population will have a value less than or
equal to x. 0 ≤ F[x] ≤ 1. Also, if x < y, then F[x] ≤ F[y].
The empirical distribution, usually represented in the form of a cumulative
frequency polygon or a bar plot, is the distribution of values observed
in a sample taken from a population. If F_n[x] denotes the cumulative
distribution of observations in a sample of size n, then as the size of the
sample increases we have F_n[x] → F[x].
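The convergence F_n[x] → F[x] can be checked numerically. The sketch below is only an illustration under stated assumptions: the population is taken to be N(0,1), and the helper names ecdf, normal_cdf, and max_gap are hypothetical, not from the text. It builds the empirical distribution of a simulated sample and reports the largest gap between F_n and F, which tends to shrink as n grows.

```python
import math
import random
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical distribution function F_n of a sample."""
    ordered = sorted(sample)
    n = len(ordered)

    def f_n(x):
        # bisect_right counts how many observations are <= x.
        return bisect_right(ordered, x) / n

    return f_n


def normal_cdf(x):
    """F[x] for the assumed N(0,1) population."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def max_gap(sample):
    """Largest |F_n[x] - F[x]|, checked on both sides of every jump of F_n."""
    ordered = sorted(sample)
    n = len(ordered)
    gap = 0.0
    for i, x in enumerate(ordered):
        f = normal_cdf(x)
        gap = max(gap, abs((i + 1) / n - f), abs(i / n - f))
    return gap


rng = random.Random(0)  # fixed seed so the run is repeatable
gaps = {}
for n in (10, 100, 10_000):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    gaps[n] = max_gap(sample)
    print(n, round(gaps[n], 4))
```

For a single small sample the gap can be large, which is one reason a plot of F_n is worth examining before assuming a particular form for F.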
The limiting distribution of a sample statistic is the distribution it
approaches as the sample becomes very large. It often has a known form,
such as the Gaussian for the mean, or the Poisson for the number of events
in a large number of very small intervals.
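The Poisson case can be illustrated without simulation. The sketch below assumes that each of n small intervals independently contains an event with probability lam/n, so the event count is binomial. The function names and the choice lam = 3 are illustrative assumptions only. It compares the binomial probabilities with the Poisson probabilities as n grows.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k events in n independent intervals."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Poisson probability of exactly k events with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def max_pmf_gap(n, lam, k_max=20):
    """Largest pointwise difference between the two distributions for k <= k_max."""
    p = lam / n  # assumed per-interval event probability
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
        for k in range(min(k_max, n) + 1)
    )


lam = 3.0  # assumed mean number of events
sizes = (10, 100, 1000, 10_000)
pmf_gaps = [max_pmf_gap(n, lam) for n in sizes]
for n, gap in zip(sizes, pmf_gaps):
    print(n, f"{gap:.2e}")
```

With only 10 intervals the two distributions differ noticeably, so the Poisson form is a poor stand-in there.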
Be wary of choosing a statistical procedure that is optimal only for a
limiting distribution and not when applied to a small sample. For a small
sample, the empirical distribution may be a better guide.
HYPOTHESIS, NULL HYPOTHESIS, ALTERNATIVE
The dictionary definition of a hypothesis is a proposition, or set of proposi-
tions, put forth as an explanation for certain phenomena.
For statisticians, a simple hypothesis would be that the distribution from
which an observation is drawn takes a specific form. For example, F[x] is
N(0,1). In the majority of cases, a statistical hypothesis will be compound
rather than simple; for example, that the distribution from which an
observation is drawn has a mean of zero.
Often, it is more convenient to test a null hypothesis; for example,
that there is no (null) difference between the parameters of two
populations.
There is no point in performing an experiment or conducting a survey
unless one also has one or more alternative hypotheses in mind.
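One way to make the null hypothesis of no difference concrete is a two-sample permutation test. The text does not prescribe this method. The sketch assumes the parameter of interest is the mean and that the alternative is two-sided. The function name permutation_p_value and the measurements are invented for illustration. Under the null hypothesis the group labels are exchangeable, so the observed difference in means is compared with the differences obtained after shuffling the labels.

```python
import random


def permutation_p_value(a, b, n_perm=5000, seed=0):
    """Monte Carlo p-value for the null hypothesis of no difference in means.

    Two-sided alternative: the population means differ in either direction.
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel the observations at random
        pa, pb = pooled[: len(a)], pooled[len(a):]
        diff = abs(sum(pa) / len(pa) - sum(pb) / len(pb))
        if diff >= observed - 1e-12:  # small tolerance for rounding error
            count += 1
    # Count the observed labelling itself as one of the rearrangements.
    return (count + 1) / (n_perm + 1)


# Hypothetical measurements, made up for this illustration.
group_a = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]
group_b = [14.2, 13.9, 14.5, 14.1, 13.8, 14.4]
p_value = permutation_p_value(group_a, group_b)
print(p_value)
```

A one-sided alternative would drop the absolute values and count only differences in the stated direction, so the alternative has to be chosen before the data are examined.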
PARAMETRIC, NONPARAMETRIC, AND
SEMIPARAMETRIC MODELS
Models can be subdivided into two components, one systematic
and one random. The systematic component can be a function of certain