ans =
12.4712
which does not differ by very much in this particular example. However, we
will see later that this difference can be significant for distributions that are
not symmetric. A more general parameter to define fractions of the data less
than, or equal to, a certain value is the quantile. Some of the quantiles have
special names, such as the three quartiles dividing the distribution into four
equal parts, 0-25%, 25-50%, 50-75% and 75-100% of the total number of
observations. We use the function quantile to compute the three quartiles.
quantile(corg,[.25 .50 .75])
ans =
11.4054 12.4712 13.2965
25% of the data values are therefore lower than 11.4054, another 25% are
between 11.4054 and 12.4712, another 25% are between 12.4712 and 13.2965,
and the remaining 25% are higher than 13.2965.
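The following minimal sketch checks this interpretation numerically. Because the corg data are not reproduced here, it uses a synthetic sample x as a stand-in; the sample size, mean and standard deviation are assumptions chosen for illustration only.

```matlab
% Synthetic stand-in for corg (assumption: not the original data)
rng(0)                              % fix the seed for reproducibility
x = 12.3 + 1.2*randn(1000,1);       % hypothetical bell-shaped sample

q = quantile(x,[.25 .50 .75]);      % the three quartiles

% Fraction of observations falling into each of the four intervals
f = [mean(x <= q(1)), ...
     mean(x > q(1) & x <= q(2)), ...
     mean(x > q(2) & x <= q(3)), ...
     mean(x > q(3))];

disp(q)
disp(f)                             % each entry should be close to 0.25
```

The second quartile q(2) is the median, so it should agree with the value returned by median(x).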
The third parameter in this context is the mode, which is the midpoint
of the interval with the highest frequency. The MATLAB function mode, which
identifies the most frequent value in a sample, is unlikely to provide a good
estimate of the peak in continuous probability distributions, such as the one
in corg. Furthermore, the mode function is not suitable for finding peaks in
distributions that have multiple modes. In these cases it is better to compute a
histogram and calculate the peak of that histogram. We can use the function
find to locate the class that has the largest number of observations.
v(find(n == max(n)))
or simply
v(n == max(n))
ans =
12.3107
Both statements are equivalent and identify the largest element in n. The index
of this element is then used to display the midpoint of the corresponding
class v. If several elements in n share the maximum value, this statement
returns several solutions, suggesting that the distribution has several modes.
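As a self-contained sketch of this approach, the lines below build the class counts n and the class midpoints v from scratch and then pick the most populated class. The synthetic sample x, the class edges e and the class width of 0.5 are assumptions made for illustration; they are not the values used to obtain the result above.

```matlab
% Synthetic stand-in for corg (assumption: not the original data)
rng(0)                              % fix the seed for reproducibility
x = 12.3 + 1.2*randn(1000,1);       % hypothetical bell-shaped sample

e = 8:0.5:17;                       % class edges (assumed class width 0.5)
n = histcounts(x,e);                % number of observations per class
v = e(1:end-1) + 0.5*diff(e);       % midpoints of the classes

% Midpoint(s) of the class(es) with the largest number of observations
xmode = v(n == max(n))
```

The estimate depends on the chosen class width: wider classes give a more stable but coarser estimate of the mode, whereas narrower classes make ties and spurious peaks more likely.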
The median, quartiles, minimum, and maximum of a data set can be
summarized and displayed in a box and whisker plot.
boxplot(corg)