than that of the individual manufacturers because of the aggregation effect. In summary, the three data
sources provided us with a total of 300 time series for our experiments.
Experimental Design
We conducted experiments adopting a representative set of traditional forecasting techniques as a control
group and a set of machine learning techniques as a treatment group. To compare the two groups, every
technique from each group was used to forecast demand one month into the future for all 100 series in
each of the three datasets identified previously. This resulted in 4,700 forecast data points for the
chocolate manufacturer, 6,500 for the toner cartridge manufacturer, and 14,800 for the Statistics
Canada dataset for every technique tested. However, since all forecasting techniques require past data
to make a forecast into the future, each algorithm had a predetermined startup period specific to it,
which slightly reduced the number of forecast observations.
Additionally, the demand time series were formally separated into training sets and testing sets.
This is particularly important for the ML techniques, where the training set was used by the ML models
to learn the demand patterns and the testing set was used to estimate how well the learned forecasting
capability would generalize to future data. The main performance measure that we used to test the
hypotheses was the absolute error (AE) for every forecast data point. To make the absolute error
comparable across products, we normalized this measure by dividing it by the standard deviation of the
training set. Thus, the performance of the different techniques was compared in terms of normalized
absolute error (NAE), using a t-test to determine whether there was a statistically significant
difference in the error (forecasting performance) of the techniques.
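As a concrete illustration of this evaluation, the sketch below computes NAE values for two hypothetical forecasts of one product and compares them. It is a minimal sketch in Python with NumPy/SciPy (the study itself was carried out in MATLAB); the demand values and variable names are invented for illustration, and the paired t-test is one reasonable reading of the comparison described above, not necessarily the exact test used.

    import numpy as np
    from scipy import stats

    def normalized_absolute_error(actual, forecast, train_std):
        """Absolute error of each one-month-ahead forecast, normalized by the
        standard deviation of that product's training set (NAE)."""
        return np.abs(np.asarray(actual, float) - np.asarray(forecast, float)) / train_std

    # Illustrative values only: one product's 9-month test demand and two competing forecasts.
    actual     = np.array([120, 135, 128, 140, 150, 138, 145, 152, 160])
    forecast_a = np.array([118, 130, 131, 137, 148, 142, 143, 150, 158])   # e.g. a traditional technique
    forecast_b = np.array([125, 128, 120, 133, 155, 130, 150, 145, 168])   # e.g. an ML technique
    train_std  = 14.2                                                      # std. dev. of the training set

    nae_a = normalized_absolute_error(actual, forecast_a, train_std)
    nae_b = normalized_absolute_error(actual, forecast_b, train_std)

    # Paired t-test on the per-observation NAEs of the two techniques.
    t_stat, p_value = stats.ttest_rel(nae_a, nae_b)
    print(f"mean NAE A = {nae_a.mean():.3f}, mean NAE B = {nae_b.mean():.3f}, p = {p_value:.3f}")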
Experimental Procedure
To test the proposed hypothesis, we executed all of the forecasting algorithms on the demand series
from the three datasets. The first step in implementing this experiment was preparing the data and
separating it into training and testing sets. Since ML techniques require large amounts of data to
properly detect true patterns, we used 80% of each time series for training and the remaining 20% for
testing. In the second step, we employed all of the selected techniques to produce forecasts. All of
the data processing and forecasting was performed in the MATLAB 7.0 environment (MathWorks, 2005c).
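A minimal sketch of this first step is given below, written in Python rather than the MATLAB 7.0 environment actually used, with invented variable names. The split keeps time order (no shuffling), so every test observation lies strictly after the training period.

    import numpy as np

    def chronological_split(series, train_fraction=0.8):
        """Split one demand time series into training and testing sets,
        keeping time order: the first 80% of months train, the rest test."""
        series = np.asarray(series, dtype=float)
        cut = int(round(train_fraction * len(series)))
        return series[:cut], series[cut:]

    # Illustrative monthly demand for a single product (47 months).
    rng = np.random.default_rng(0)
    demand = rng.poisson(lam=100, size=47).astype(float)
    train, test = chronological_split(demand)
    print(len(train), len(test))   # 38 training months and 9 testing months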
To illustrate, in the chocolate factory dataset the training set contains 80% of the data, i.e., 38
months of demand, and the testing set contains the remaining 20%, i.e., 9 months of demand. This
corresponds to data from October 2000 to November 2003 for the training set and from December 2003 to
August 2004 for the testing set. Across the 100 series, the testing set contained a total of 900
forecast observations used to compare the performance of the forecasting techniques.
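Continuing this illustration, the sketch below shows how the 900 forecast observations arise: each of the 100 series contributes one NAE value per test month, and 100 series x 9 test months = 900. The dataset here is randomly generated, and the naive last-value forecast is only a placeholder standing in for the techniques compared in the study.

    import numpy as np

    rng = np.random.default_rng(1)
    # Stand-in for the chocolate dataset: 100 series of 47 monthly demand values.
    dataset = rng.poisson(lam=100, size=(100, 47)).astype(float)

    all_nae = []
    for series in dataset:
        train, test = series[:38], series[38:]       # Oct 2000-Nov 2003 / Dec 2003-Aug 2004
        train_std = train.std(ddof=1)
        history = list(train)
        for actual in test:                          # one-month-ahead forecasts over the test set
            forecast = history[-1]                   # naive last-value forecast (placeholder only)
            all_nae.append(abs(actual - forecast) / train_std)
            history.append(actual)                   # advance one month
    print(len(all_nae))                              # 100 series x 9 test months = 900 observations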
Some forecasting algorithms, such as multiple linear regression, neural networks, and support vector
machines, require windowed data, i.e., a fixed number of past observations used as inputs to predict
future demand. For example, with a window size of 3 months, the current month's demand and the demand
from the previous two months are used to predict the next month's demand. For some of the simpler
forecasting techniques, such as the moving average, trend, and exponential smoothing, we implemented two ver-