the bulk of the data (central 50%), while the length of the whiskers shows the
extent of the rest of the data (USEPA, 2006). The box plot divides the data
set into four sections, each containing 25% of the data. Each whisker
extends to the most extreme data value lying within 1.5 times the interquartile
range of the corresponding quartile and indicates how far the tails of the
distribution stretch. The width
of the box has no specific meaning; the plot can be made quite narrow without
affecting its visual impact. The values beyond the ends of the whiskers are
unusually small or large data points, which are called outliers and are displayed
by an asterisk (*) on the plot. A 'box and whisker plot' can be used to evaluate the
symmetry of the data (USEPA, 2006). If the data distribution is symmetrical,
the median divides the box into two equal halves, the upper and lower
whiskers are of equal length, and the extreme data points are distributed
equally on either end of the plot. Since a 'box and whisker plot' is tedious
to construct manually, STATISTICA software may
be used for creating this plot. The following steps are used for generating a
box and whisker plot:
Step 1: Choose the vertical scale of the plot based on the maximum and
minimum values of the time series data. Select a width for the box
plot, keeping in mind that the width has no particular meaning and is
only a visualization aid. If the width is labelled W, the horizontal
scale of the plot ranges from -½W to +½W.
Step 2: Compute the upper quartile [Q(0.75), or the 75th percentile] and the
lower quartile [Q(0.25), or the 25th percentile] from the time series
data. Compute the sample mean and median (Xm) for the time series
data. Then compute the interquartile range (IQR), where
IQR = Q(0.75) - Q(0.25).
Step 3: Draw a box through the four points [-½W, Q(0.75)], [-½W, Q(0.25)],
[½W, Q(0.25)] and [½W, Q(0.75)]. Draw a line from [½W, Q(0.5)] to
[-½W, Q(0.5)] and mark the point (0, Xm) with (+). The line or point
(0, Xm) indicates the median of the data.
Step 4: Compute the upper end of the top whisker by finding the largest data
value X less than Q(0.75) + 1.5 × [Q(0.75) - Q(0.25)]. Draw a vertical line
from [0, Q(0.75)] to (0, X). Compute the lower end of the bottom
whisker by finding the smallest data value Y greater than
Q(0.25) - 1.5 × [Q(0.75) - Q(0.25)]. Draw a vertical line from
[0, Q(0.25)] to (0, Y).
Step 5: For all points X* > X (outliers and extremes), place an asterisk (*) at
the point (0, X*). For all points Y* < Y (outliers and extremes), place
an asterisk (*) at the point (0, Y*).
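The numerical part of Steps 2, 4 and 5 can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used by STATISTICA or by USEPA (2006): the function name box_plot_summary is hypothetical, and the text does not say which quartile convention to use, so the 'inclusive' (linear interpolation) method of the standard library is assumed here. Other conventions give slightly different quartiles for small samples. The fallback used when no data value lies strictly inside a fence is also an assumption.

```python
from statistics import mean, quantiles


def box_plot_summary(values):
    """Return the numbers needed to draw a box and whisker plot.

    Hypothetical helper for illustration only. The quartile
    convention (method="inclusive") is an assumption.
    """
    data = sorted(values)

    # Step 2: quartiles, median, mean and interquartile range.
    q25, q50, q75 = quantiles(data, n=4, method="inclusive")
    iqr = q75 - q25

    # Step 4: fences at 1.5 * IQR beyond each quartile.
    upper_fence = q75 + 1.5 * iqr
    lower_fence = q25 - 1.5 * iqr

    # Upper whisker end X: largest data value below the upper fence.
    # Lower whisker end Y: smallest data value above the lower fence.
    # Assumption: if no value qualifies (e.g. all values are equal),
    # the whisker collapses onto the edge of the box.
    x_top = max((v for v in data if v < upper_fence), default=q75)
    y_bottom = min((v for v in data if v > lower_fence), default=q25)

    # Step 5: points beyond the whisker ends are plotted as asterisks.
    outliers = [v for v in data if v > x_top or v < y_bottom]

    return {
        "mean": mean(data),
        "median": q50,
        "q25": q25,
        "q75": q75,
        "iqr": iqr,
        "upper_whisker": x_top,
        "lower_whisker": y_bottom,
        "outliers": outliers,
    }


# Example with made-up values (not data from the text).
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

The returned values map directly onto the drawing steps: the box spans q25 to q75, the median line sits at the median value, the whiskers run from the box edges to upper_whisker and lower_whisker, and each entry in outliers is marked with an asterisk.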
3.1.4 Ranked Data Plot
A 'ranked data plot' is a useful graphical method that is easy to construct and
interpret, and does not depend upon any assumptions about a model for the
time series data. It is not subjective as the user does not have to make any