Therefore, for the current implementation, the number of weights will always be:

\[
\text{Total Weights} = p \cdot w \cdot 10 + 1 \cdot 10 + 10 \cdot 1 + 1 \cdot 1
\]
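To make the count concrete, here is a minimal sketch in Python (our own naming, not the chapter's code) that counts the weights of a fully connected network with one hidden layer, assuming one bias weight per hidden neuron and per output neuron, as in the equation above:

```python
def total_weights(n_inputs, n_hidden=10, n_outputs=1):
    """Count the weights of a fully connected one-hidden-layer network."""
    input_to_hidden = n_inputs * n_hidden    # one weight per input-hidden pair
    hidden_biases = n_hidden                 # one bias weight per hidden neuron
    hidden_to_output = n_hidden * n_outputs  # one weight per hidden-output pair
    output_biases = n_outputs                # one bias weight per output neuron
    return input_to_hidden + hidden_biases + hidden_to_output + output_biases
```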
And the Observations to Weights ratio is:

\[
\text{Observations to Weights} = \frac{n \cdot p \cdot (1 - w)}{p \cdot w \cdot h + b_h + h \cdot o + b_o}
\]

where $h$ is the number of hidden neurons, $o$ the number of output neurons, and $b_h$ and $b_o$ the number of hidden and output bias weights ($b_h = h$ and $b_o = o$, one bias per neuron).
Therefore, for the chocolate manufacturer dataset, the Observations to Weights ratio is:

\[
\text{Observations to Weights} = \frac{100 \cdot 38 \cdot (1 - 0.05)}{38 \cdot 0.05 \cdot 10 + 1 \cdot 10 + 10 \cdot 1 + 1 \cdot 1} = \frac{3610}{40} = 90.25
\]
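As a quick check of the arithmetic, the same ratio can be computed directly (values taken from the example above; the variable names are ours):

```python
n, p, w = 100, 38, 0.05   # observations per series, products, window fraction
h, o = 10, 1              # hidden neurons and outputs in the current implementation

observations = n * p * (1 - w)               # 100 * 38 * 0.95 = 3610
weights = p * w * h + 1 * h + h * o + 1 * o  # 19 + 10 + 10 + 1 = 40
print(round(observations / weights, 2))      # 90.25
```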
We can see that we now have a very high observations-to-weights ratio, which helps us build much better models, especially for very noisy data.
RESULTS
This section discusses the performance of the selected models on three datasets. Table 1 presents the mean absolute errors (MAE) of all the tested forecasting techniques as applied to the chocolate manufacturer's dataset, in ascending order of testing-set error, with the best-performing techniques at the top of the list and the worst at the bottom. The results are also provided in a similar format for the Toner Cartridge manufacturer's dataset (Table 2) and the Statistics Canada manufacturing dataset (Table 3).
The results suggest that one of the ML approaches, the support vector machine (SVM) under the Super Wide modeling approach, ranks at the top on all three datasets, providing consistently better performance. If we ignore the Super Wide models, we find that the results of previous research and of the M3-Competition are, in essence, reproduced: simple techniques outperform the more complicated and sophisticated approaches. For example, on the two primary datasets of interest, the Chocolate (Table 1) and Toner Cartridge (Table 2) manufacturers' data, exponential smoothing performs best among the non-Super-Wide techniques, at Rank 5 and Rank 3 respectively, immediately after the top Super Wide models.
This is especially true in our experiments, since the data we are concerned with are very noisy, and the exponential smoothing (ES) approach outperformed all of the other conventional approaches, including the advanced ML ones, in some cases by considerable margins. Notably, the Toner Cartridge dataset was so noisy, or its patterns changed so much over time, that even exponential smoothing with a fixed parameter of 20% (Table 2 - Rank 3) outperformed the automated version (Table 2 - Rank 5), which optimized the parameter for the training set.
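To illustrate the two variants, here is a minimal simple-exponential-smoothing sketch (our own code and naming, not the chapter's implementation; the "automated" variant is approximated by a naive grid search over the smoothing parameter on the training set):

```python
def exp_smooth_forecasts(series, alpha):
    """Return one-step-ahead forecasts from simple exponential smoothing."""
    level = series[0]                # initialize the level with the first value
    forecasts = []
    for x in series:
        forecasts.append(level)      # forecast for period t uses data up to t-1
        level = alpha * x + (1 - alpha) * level
    return forecasts

def fit_alpha(train, grid=(0.05, 0.1, 0.2, 0.3, 0.5, 0.8)):
    """'Automated' variant: pick the alpha that minimizes training MAE."""
    def train_mae(alpha):
        fc = exp_smooth_forecasts(train, alpha)
        return sum(abs(x - f) for x, f in zip(train, fc)) / len(train)
    return min(grid, key=train_mae)
```

On noisy series, the alpha chosen on the training set need not transfer to the testing set, which is consistent with the fixed 20% variant ranking ahead of the automated one in Table 2.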
We observed the same pattern with the Moving Average approach: the version fixed to a window of 6 periods (Table 2 - Rank 4) outperformed the automatic version (Table 2 - Rank 7), which likewise suffered from overfitting.
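The fixed variant of the moving average is equally small; a sketch under the same assumptions, with the window of 6 periods used in Table 2:

```python
def moving_average_forecasts(series, window=6):
    """One-step-ahead forecasts: the mean of the previous `window` values."""
    forecasts = []
    for t in range(len(series)):
        past = series[max(0, t - window):t]
        # Fall back to the first observation when no history is available yet.
        forecasts.append(sum(past) / len(past) if past else series[0])
    return forecasts
```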
The average error of the automatic exponential smoothing across the two manufacturers' datasets is 0.7516, the average for the fixed exponential smoothing of 20% is 0.7501, and the difference has a significance of 0.4037. The moving average with a window of 6 periods has an average error of 0.7561, and its mean difference with the automatic exponential smoothing has a significance of 0.2273. Although we see this overfitting pattern repeat itself, the difference in means is statistically insignificant for the
 