packages include bar charts, pie charts, and line graphs. Other popular displays of data
summaries and distributions include quantile plots , quantile-quantile plots , histograms ,
and scatter plots .
2.2.1 Measuring the Central Tendency: Mean, Median, and Mode
In this section, we look at various ways to measure the central tendency of data. Suppose
that we have some attribute X, like salary, which has been recorded for a set of objects.
Let $x_1, x_2, \dots, x_N$ be the set of N observed values or observations for X. Here, these
values may also be referred to as the data set (for X). If we were to plot the observations
for salary, where would most of the values fall? This gives us an idea of the central
tendency of the data. Measures of central tendency include the mean, median, mode, and
midrange.
The most common and effective numeric measure of the “center” of a set of data is
the (arithmetic) mean. Let $x_1, x_2, \dots, x_N$ be a set of N values or observations, such as for
some numeric attribute X, like salary. The mean of this set of values is

\[
\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}. \tag{2.1}
\]
This corresponds to the built-in aggregate function, average (avg() in SQL), provided in
relational database systems.
Example 2.6 Mean. Suppose we have the following values for salary (in thousands of dollars), shown
in increasing order: 30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110. Using Eq. (2.1), we have

\[
\bar{x} = \frac{30 + 36 + 47 + 50 + 52 + 52 + 56 + 60 + 63 + 70 + 70 + 110}{12} = \frac{696}{12} = 58.
\]

Thus, the mean salary is $58,000.
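The arithmetic in Example 2.6 can be checked with a few lines of Python. This is only a minimal sketch of Eq. (2.1); the function name `arithmetic_mean` and the variable names are illustrative choices, not something defined in the text.

```python
# Salary values (in thousands of dollars) from Example 2.6.
salaries = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]


def arithmetic_mean(values):
    """Return the arithmetic mean per Eq. (2.1): the sum of the values divided by N.

    `arithmetic_mean` is a hypothetical helper name used for illustration.
    """
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


mean_salary = arithmetic_mean(salaries)
print(mean_salary)  # 58.0, i.e., a mean salary of $58,000
```

The empty-input check is a design choice made here: Eq. (2.1) divides by N, so the mean is undefined when there are no observations.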
Sometimes, each value $x_i$ in a set may be associated with a weight $w_i$ for $i = 1, \dots, N$.
The weights reflect the significance, importance, or occurrence frequency attached to
their respective values. In this case, we can compute

\[
\bar{x} = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_N x_N}{w_1 + w_2 + \cdots + w_N}. \tag{2.2}
\]

This is called the weighted arithmetic mean or the weighted average.
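As a rough illustration of Eq. (2.2), the sketch below treats the weights as occurrence frequencies, one of the interpretations mentioned above. It reuses the salary data of Example 2.6, collapsed to distinct values with their counts; this frequency encoding and the name `weighted_mean` are assumptions made for the example, not part of the text.

```python
def weighted_mean(values, weights):
    """Return the weighted arithmetic mean per Eq. (2.2).

    `weighted_mean` is a hypothetical helper name used for illustration.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    # Numerator: sum of w_i * x_i; denominator: sum of w_i.
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Distinct salary values from Example 2.6 (in thousands of dollars) ...
distinct_salaries = [30, 36, 47, 50, 52, 56, 60, 63, 70, 110]
# ... and how often each occurs (52 and 70 each appear twice).
frequencies = [1, 1, 1, 1, 2, 1, 1, 1, 2, 1]

print(weighted_mean(distinct_salaries, frequencies))  # 58.0
```

With frequencies as weights, the weighted mean agrees with the plain mean of the full list (58). When every weight is equal, Eq. (2.2) reduces to Eq. (2.1).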
 