Database Reference
In-Depth Information
By age and sex:
mysql>
SELECT age, sex, COUNT(score) AS n,
->
SUM(score) AS sum,
->
MIN(score) AS minimum,
->
MAX(score) AS maximum,
->
AVG(score) AS mean,
->
STDDEV_SAMP(score) AS 'std. dev.',
->
VAR_SAMP(score) AS 'variance'
->
FROM testscore
->
GROUP BY age, sex;
+-----+-----+---+------+---------+---------+--------+-----------+----------+
| age | sex | n | sum | minimum | maximum | mean | std. dev. | variance |
+-----+-----+---+------+---------+---------+--------+-----------+----------+
| 5 | M | 2 | 9 | 4 | 5 | 4.5000 | 0.7071 | 0.5000 |
| 5 | F | 2 | 13 | 6 | 7 | 6.5000 | 0.7071 | 0.5000 |
| 6 | M | 2 | 17 | 8 | 9 | 8.5000 | 0.7071 | 0.5000 |
| 6 | F | 2 | 10 | 4 | 6 | 5.0000 | 1.4142 | 2.0000 |
| 7 | M | 2 | 14 | 6 | 8 | 7.0000 | 1.4142 | 2.0000 |
| 7 | F | 2 | 16 | 7 | 9 | 8.0000 | 1.4142 | 2.0000 |
| 8 | M | 2 | 15 | 6 | 9 | 7.5000 | 2.1213 | 4.5000 |
| 8 | F | 2 | 17 | 7 | 10 | 8.5000 | 2.1213 | 4.5000 |
| 9 | M | 2 | 16 | 7 | 9 | 8.0000 | 1.4142 | 2.0000 |
| 9 | F | 2 | 19 | 9 | 10 | 9.5000 | 0.7071 | 0.5000 |
+-----+-----+---+------+---------+---------+--------+-----------+----------+
15.3. Generating Frequency Distributions
Problem
You want to know the frequency of occurrence for each value in a table.
Solution
Derive a frequency distribution that summarizes the contents of your dataset.
Discussion
A common application for per-group summary techniques is to generate a
frequency
distribution
that shows how often each value occurs. For the
testscore
table, the fre‐
quency distribution looks like this:
mysql>
SELECT score, COUNT(score) AS counts
->
FROM testscore GROUP BY score;
+-------+--------+
| score | counts |
+-------+--------+
| 4 | 2 |
| 5 | 1 |
| 6 | 4 |