with an odd number of observations, the median is the mid data point which
has an equal number of observations both above and below it. For a data
series with an even number of observations, it is the average of the two central
observations. To compute the median, first rank the observations from the
smallest (x_1) to the largest (x_n) and then use one of the following
equations, depending on the number of observations (n):
$$P_{50} = x_{(n+1)/2} \quad \text{when $n$ is odd} \tag{5}$$

and

$$P_{50} = \tfrac{1}{2}\left(x_{n/2} + x_{n/2+1}\right) \quad \text{when $n$ is even.} \tag{6}$$
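The two cases of Equations (5) and (6) can be sketched in Python as follows. This is a minimal illustration, not code from the source: the function name `median_p50` is hypothetical, and the only subtlety is that the equations use 1-based ranks while Python lists are 0-based.

```python
def median_p50(values):
    """Return the median (P50) of a data series, following Eqs. (5) and (6)."""
    ranked = sorted(values)  # rank from smallest (x_1) to largest (x_n)
    n = len(ranked)
    if n == 0:
        raise ValueError("the median of an empty series is undefined")
    mid = n // 2
    if n % 2 == 1:
        # n odd: x_{(n+1)/2}; the 1-based rank (n+1)/2 is 0-based index n // 2
        return ranked[mid]
    # n even: average of x_{n/2} and x_{n/2+1} (0-based indices mid-1 and mid)
    return 0.5 * (ranked[mid - 1] + ranked[mid])


print(median_p50([3, 1, 2]))     # odd n: middle ranked value
print(median_p50([4, 1, 3, 2]))  # even n: average of the two central values
```

Sorting first is what makes the result depend only on the relative order of the observations rather than on the order in which they were recorded.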
Unlike the mean, the median is highly resistant: it is only slightly affected
by the magnitude of any single observation in a data series, because it is
determined solely by the relative order of the observations. This resistance to
a change in value, or to the presence of outlier observations, is often a
desirable property. The median is preferred over the mean when a robust summary
statistic is desired, that is, one that is not strongly influenced by a few
extremely low or high observations. One such example is the expected daily
rainfall across a network of raingauge stations for a given day. Suppose one of
the raingauge stations recorded unusually high daily rainfall compared with the
other stations. Using the median, that one station will have no greater effect
on the expected daily rainfall than the stations with low daily rainfall.
However, if the mean is used, the expected daily rainfall may be pulled towards
the outlier and be higher than the daily rainfall recorded by most of the
raingauge stations.
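The raingauge example can be made concrete with a small sketch. The rainfall values below are invented purely for illustration (they are not data from the source), and the variable names are hypothetical; the standard-library `statistics` module supplies the mean and median.

```python
from statistics import mean, median

# Hypothetical daily rainfall (mm) at seven raingauge stations.
# Values are invented for illustration; the last station is the outlier.
rainfall_mm = [2.0, 3.0, 3.0, 4.0, 4.0, 5.0, 70.0]

station_mean = mean(rainfall_mm)      # 13.0, pulled towards the outlier
station_median = median(rainfall_mm)  # 4.0, set by the rank order only

# Number of stations whose rainfall is below the network mean
n_below_mean = sum(1 for r in rainfall_mm if r < station_mean)

print(f"mean = {station_mean:.1f} mm, median = {station_median:.1f} mm")
print(f"{n_below_mean} of {len(rainfall_mm)} stations recorded less than the mean")

# Making the outlier ten times larger leaves the median unchanged
more_extreme = rainfall_mm[:-1] + [700.0]
print(median(more_extreme) == station_median)
```

With these made-up numbers, six of the seven stations recorded less than the mean, whereas the median sits in the middle of the ranked values regardless of how extreme the largest reading is.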
2.1.3 Additional Measures of Location
In addition to the classical and robust measures of location, four further
measures of location are the 'mode', 'geometric mean', 'harmonic mean' and
'trimmed mean', which are less frequently used. The mode is defined as the most
frequently observed value in a given data series. It is the value having the
highest bar in a histogram. The mode is more applicable to grouped data (data
recorded only as falling into a finite number of categories) than to continuous
data. Although it is very easy to obtain, it is a poor measure of location for
continuous data because its value often depends on the arbitrary grouping of
the data (Helsel and Hirsch, 2002). The geometric mean (GM) is often used to
compute a summary statistic for positively skewed datasets. It is the mean of
the logarithms, transformed back to their original units:
$$\mathrm{GM} = \exp\left(\frac{1}{n}\sum_{i=1}^{n} \ln x_i\right) \tag{7}$$
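The definition of the geometric mean as the back-transformed mean of the logarithms can be sketched as below. This is an illustrative implementation under the assumption that all observations are strictly positive (the logarithm is otherwise undefined); the function name `geometric_mean_log` is hypothetical.

```python
import math


def geometric_mean_log(values):
    """Geometric mean: exp of the arithmetic mean of the natural logs (Eq. 7)."""
    values = list(values)
    if not values:
        raise ValueError("the geometric mean of an empty series is undefined")
    if any(x <= 0 for x in values):
        # Assumption of this sketch: observations must be strictly positive
        raise ValueError("all observations must be positive to take logarithms")
    mean_log = sum(math.log(x) for x in values) / len(values)
    return math.exp(mean_log)  # transform back to the original units


# For a positively skewed series the GM lies below the arithmetic mean
sample = [1.0, 10.0, 100.0]
print(geometric_mean_log(sample))  # approximately 10
print(sum(sample) / len(sample))   # arithmetic mean, 37.0
```

Python 3.8 and later also provide `statistics.geometric_mean` in the standard library, which can be used instead of a hand-written version.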
 