Database Reference
In-Depth Information
-> FROM rainfall AS t1 INNER JOIN rainfall AS t2
-> ON t1.date >= t2.date
-> GROUP BY t1.date;
+------------+--------------+-------------+--------------+-------------+
| date | daily precip | cum. precip | days elapsed | avg. precip |
+------------+--------------+-------------+--------------+-------------+
| 2014-06-01 | 1.50 | 1.50 | 1 | 1.500000 |
| 2014-06-03 | 0.50 | 2.00 | 2 | 1.000000 |
| 2014-06-05 | 1.00 | 3.00 | 3 | 1.000000 |
+------------+--------------+-------------+--------------+-------------+
To fix the problem, determine the number of days elapsed a different way. Take the
minimum and maximum date involved in each sum and calculate a days-elapsed value
from them:
DATEDIFF(MAX(t2.date),MIN(t2.date)) + 1
That value must be used for the days-elapsed column and for computing the running
averages. The resulting statement is as follows:
mysql> SELECT t1.date, t1.precip AS 'daily precip',
-> SUM(t2.precip) AS 'cum. precip',
-> DATEDIFF(MAX(t2.date),MIN(t2.date)) + 1 AS 'days elapsed',
-> SUM(t2.precip) / (DATEDIFF(MAX(t2.date),MIN(t2.date)) + 1)
-> AS 'avg. precip'
-> FROM rainfall AS t1 INNER JOIN rainfall AS t2
-> ON t1.date >= t2.date
-> GROUP BY t1.date;
+------------+--------------+-------------+--------------+-------------+
| date | daily precip | cum. precip | days elapsed | avg. precip |
+------------+--------------+-------------+--------------+-------------+
| 2014-06-01 | 1.50 | 1.50 | 1 | 1.500000 |
| 2014-06-03 | 0.50 | 2.00 | 3 | 0.666667 |
| 2014-06-05 | 1.00 | 3.00 | 5 | 0.600000 |
+------------+--------------+-------------+--------------+-------------+
As this example illustrates, calculation of cumulative values from relative values requires
only a column that enables rows to be placed into the proper order. (For the rainfall
table, that's the date column.) Values in the column need not be sequential, or even
numeric. This differs from calculations that produce difference values from cumulative
values (see Recipe 15.9 ), which require a table that has a column containing an unbroken
sequence.
The running averages in the rainfall examples are based on dividing cumulative pre‐
cipitation sums by number of days elapsed as of each day. When the table has no gaps,
the number of days is the same as the number of values summed, making it easy to find
successive averages. When rows are missing, the calculations become more complex.
This demonstrates that it's necessary to consider the nature of your data and calculate
averages appropriately. The next example is conceptually similar to the previous ones
Search WWH ::




Custom Search