to explore these distributions. We then try to draw conclusions from the
sample that can be applied to the larger population of interest (hypothesis
testing). Sections 3.6 to 3.12 introduce the most important statistical tests for
applications in earth sciences. The final section in this chapter (Section 3.13)
introduces methods used to fit distributions to our own data sets.
3.2 Empirical Distributions
Let us assume that we have collected a number of measurements x_i from a
specific object. The collection of data, or sample, as a subset of the population
of interest, can be written as a vector x, or one-dimensional array,
x = (x_1, x_2, ..., x_N),
containing a total of N observations. The vector x may contain a large
number of data points and it may consequently be difficult to understand its
properties. Descriptive statistics are therefore often used to summarize the
characteristics of the data. The statistical properties of the data set may be
used to define an empirical distribution, which can then be compared to a
theoretical one.
The most straightforward way of investigating the sample characteristics
is to display the data in a graphical form. Plotting all of the data points
along a single axis does not reveal a great deal of information about the data
set. However, the density of the points along the scale does provide some
information about the characteristics of the data. A widely used graphical
display of univariate data is the histogram (Fig. 3.1). A histogram is a bar
plot of a frequency distribution that is organized in intervals or classes. Such
histogram plots provide valuable information on the characteristics of the
data, such as the central tendency, the dispersion and the general shape of the
distribution. However, quantitative measures provide a more accurate way
of describing the data set than the graphical form. In purely quantitative
terms, the mean and the median define the central tendency of the data set,
while the data dispersion is expressed in terms of the range and the standard
deviation.
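As an illustration of the quantities named above, the following minimal Python sketch counts the observations that fall into equal-width classes (the frequency distribution behind a histogram) and computes the mean, median, range and standard deviation of a small sample. The sample values, the choice of four classes, and the helper name histogram_counts are assumptions made for this example only. They are not taken from the text.

```python
import statistics

# Hypothetical sample of N = 10 measurements (illustrative values only).
x = [2.1, 3.4, 3.9, 4.2, 4.8, 5.0, 5.5, 6.1, 6.7, 8.5]


def histogram_counts(data, n_classes):
    """Count observations in n_classes equal-width classes spanning the data."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_classes
    counts = [0] * n_classes
    if width == 0:
        # All values are identical: place every observation in the first class.
        counts[0] = len(data)
        return counts
    for value in data:
        # Each class is closed on the left; the maximum goes into the last class.
        k = min(int((value - lo) / width), n_classes - 1)
        counts[k] += 1
    return counts


# Frequency distribution over four classes (the heights of the histogram bars).
counts = histogram_counts(x, 4)

# Quantitative measures of central tendency and dispersion.
summary = {
    "N": len(x),
    "mean": statistics.mean(x),
    "median": statistics.median(x),
    "range": max(x) - min(x),
    "std": statistics.stdev(x),  # sample standard deviation (N - 1 denominator)
}

print("class counts:", counts)
for name, value in summary.items():
    print(name, "=", value)
```

The number of classes is a free choice here. Different class widths can change the apparent shape of the histogram, which is one reason the quantitative measures are reported alongside it.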
Measures of Central Tendency
Parameters of central tendency or location represent the most important
measures for characterizing an empirical distribution (Fig. 3.2). These values
help locate the data on a linear scale. They represent a typical or best value