Therefore, for the current implementation, the number of weights will always be:

\[
\text{Total Weights} = p \cdot w \cdot 10 + 1 \cdot 10 + 10 \cdot 1 + 1 \cdot 1
\]
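To make the count concrete, here is a minimal sketch in Python (our own naming, not the chapter's code) that counts the weights of a fully connected network with one hidden layer, assuming one bias weight per hidden neuron and per output neuron, as in the equation above:

```python
def total_weights(n_inputs, n_hidden=10, n_outputs=1):
    """Count the weights of a fully connected one-hidden-layer network."""
    input_to_hidden = n_inputs * n_hidden    # one weight per input-hidden pair
    hidden_biases = n_hidden                 # one bias weight per hidden neuron
    hidden_to_output = n_hidden * n_outputs  # one weight per hidden-output pair
    output_biases = n_outputs                # one bias weight per output neuron
    return input_to_hidden + hidden_biases + hidden_to_output + output_biases
```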
And the Observations to Weights ratio is:

\[
\text{Observations to Weights} = \frac{n \cdot p \cdot (1 - w)}{p \cdot w \cdot h + b_h + h \cdot o + b_o}
\]

where $h$ is the number of hidden neurons, $o$ the number of output neurons, and $b_h$ and $b_o$ the number of hidden and output bias weights ($b_h = h$ and $b_o = o$, one bias per neuron).
Therefore, for the chocolate manufacturer dataset, the Observations to Weights ratio is:

\[
\text{Observations to Weights} = \frac{100 \cdot 38 \cdot (1 - 0.05)}{38 \cdot 0.05 \cdot 10 + 1 \cdot 10 + 10 \cdot 1 + 1 \cdot 1} = \frac{3610}{40} = 90.25
\]
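As a quick check of the arithmetic, the same ratio can be computed directly (values taken from the example above; the variable names are ours):

```python
n, p, w = 100, 38, 0.05   # observations per series, products, window fraction
h, o = 10, 1              # hidden neurons and outputs in the current implementation

observations = n * p * (1 - w)               # 100 * 38 * 0.95 = 3610
weights = p * w * h + 1 * h + h * o + 1 * o  # 19 + 10 + 10 + 1 = 40
print(round(observations / weights, 2))      # 90.25
```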
We can see that we now have a very high observations-to-weights ratio, which helps us build much better models, especially for very noisy data.
RESULTS
This section discusses the performance of the selected models on three datasets. Table 1 presents the mean absolute errors (MAE) of all the tested forecasting techniques as applied to the chocolate manufacturer's dataset, in ascending order of testing-set error, with the best-performing techniques at the top of the list and the worst at the bottom. The results are also provided in a similar format for the Toner Cartridge manufacturer's dataset (Table 2) and the Statistics Canada manufacturing dataset (Table 3).
The results suggest that one of the ML approaches, the support vector machine (SVM) under the Super Wide modeling approach, ranks at the top on all three datasets, providing consistently better performance. If we ignore the Super Wide models, we find that the results of previous research and of the M3-Competition are, in essence, reproduced: simple techniques outperform the more complicated and sophisticated approaches. For example, on the two primary datasets of interest, the Chocolate (Table 1) and Toner Cartridge (Table 2) manufacturers' data, exponential smoothing performs best among the non-Super-Wide techniques, at Rank 5 and Rank 3 respectively, immediately after the top Super Wide models.
This is especially true in our experiments, since the data we are concerned with are very noisy, and the exponential smoothing (ES) approach outperformed all of the other conventional approaches, including the advanced ML ones, in some cases by considerable margins. Notably, the Toner Cartridge dataset was so noisy, or its patterns changed so much over time, that even exponential smoothing with a fixed parameter of 20% (Table 2 - Rank 3) outperformed the automated version (Table 2 - Rank 5), which optimized the parameter for the training set.
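To illustrate the two variants, here is a minimal simple-exponential-smoothing sketch (our own code and naming, not the chapter's implementation; the "automated" variant is approximated by a naive grid search over the smoothing parameter on the training set):

```python
def exp_smooth_forecasts(series, alpha):
    """Return one-step-ahead forecasts from simple exponential smoothing."""
    level = series[0]                # initialize the level with the first value
    forecasts = []
    for x in series:
        forecasts.append(level)      # forecast for period t uses data up to t-1
        level = alpha * x + (1 - alpha) * level
    return forecasts

def fit_alpha(train, grid=(0.05, 0.1, 0.2, 0.3, 0.5, 0.8)):
    """'Automated' variant: pick the alpha that minimizes training MAE."""
    def train_mae(alpha):
        fc = exp_smooth_forecasts(train, alpha)
        return sum(abs(x - f) for x, f in zip(train, fc)) / len(train)
    return min(grid, key=train_mae)
```

On noisy series, the alpha chosen on the training set need not transfer to the testing set, which is consistent with the fixed 20% variant ranking ahead of the automated one in Table 2.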
We observed the same pattern with the Moving Average approach: the version fixed to a window of 6 periods (Table 2 - Rank 4) outperformed the automatic version (Table 2 - Rank 7), which likewise suffered from overfitting.
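The fixed variant of the moving average is equally small; a sketch under the same assumptions, with the window of 6 periods used in Table 2:

```python
def moving_average_forecasts(series, window=6):
    """One-step-ahead forecasts: the mean of the previous `window` values."""
    forecasts = []
    for t in range(len(series)):
        past = series[max(0, t - window):t]
        # Fall back to the first observation when no history is available yet.
        forecasts.append(sum(past) / len(past) if past else series[0])
    return forecasts
```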
The average error of the automatic exponential smoothing across the two manufacturers' datasets is 0.7516, the average for the fixed exponential smoothing of 20% is 0.7501, and the difference has a significance of 0.4037. The moving average with a window of 6 periods has an average error of 0.7561, and its mean difference with the automatic exponential smoothing has a significance of 0.2273. Although we see this overfitting pattern repeat itself, the difference in means is statistically insignificant for the
 