sions: ones with the pre-set parameter, and "automatic" versions, in which the parameters that
resulted in the most accurate forecasts on the training set were calculated.
Although, as mentioned previously, exponential smoothing performs well in many forecasting
problems, the choice of the initial value may have a significant impact on the accuracy of its forecasts.
The exponential smoothing implementations in the MATLAB Financial Toolbox (MathWorks, 2005a) and
in Excel use the first observation as the initial value, whereas other implementations, such as SPSS, use
the series average as the starting value. For this reason, we implemented both approaches in our exponential
smoothing and Theta models. The main purpose of the Multiple Linear Regression model is to provide
a linear benchmark for all of the auto-regressive type models, such as the neural networks and support
vector machines.
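As an illustration of the two initialization conventions discussed above, the following Python sketch (our own, not the MATLAB, Excel, or SPSS code; the function name and the alpha value are arbitrary) computes simple exponential smoothing with either starting value:

    import numpy as np

    def exponential_smoothing(y, alpha, init="first"):
        # Choose the starting level: the first observation (MATLAB/Excel
        # convention) or the series average (SPSS convention).
        level = y[0] if init == "first" else float(np.mean(y))
        for value in y:
            level = alpha * value + (1 - alpha) * level
        return level  # the one-step-ahead forecast is the final level

    y = np.array([112.0, 118.0, 132.0, 129.0, 121.0])
    print(exponential_smoothing(y, alpha=0.3, init="first"))
    print(exponential_smoothing(y, alpha=0.3, init="mean"))

On short series the two starting values can produce noticeably different forecasts, which is why both variants were retained.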
ARMA
The ARMA model combines an auto-regressive (AR) forecast with a moving-average (MA) forecast (Box et
al., 1994). To minimize the error, we optimized the lag used in the auto-regressive portion and the lag
used in the moving-average portion. This functionality is provided by the MATLAB GARCH Toolbox
(MathWorks, 2005b). The ARMAX model is optimized to minimize the error using the Optimization
Toolbox (MathWorks, 2005e). Only the ARMA part of the ARMAX model was used in the current
experiments.
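The lag optimization can be sketched in Python with statsmodels (the chapter itself used the MATLAB GARCH Toolbox; the grid bounds below are arbitrary, and AIC stands in here for the chapter's training-set error criterion):

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    def fit_best_arma(y, max_p=3, max_q=3):
        # Grid-search the AR lag p and the MA lag q, keeping the
        # best-scoring fitted model.
        best = (np.inf, None, None)
        for p in range(1, max_p + 1):
            for q in range(0, max_q + 1):
                try:
                    res = ARIMA(y, order=(p, 0, q)).fit()
                except Exception:
                    continue  # skip orders that fail to estimate
                if res.aic < best[0]:
                    best = (res.aic, res, (p, q))
        return best[1], best[2]  # fitted results object and (p, q)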
Theta Model
We used the version of the Theta model (Assimakopoulos & Nikolopoulos, 2000) employed in the M3 fore-
casting competition. First, the linear trend was calculated; then exponential smoothing was performed
on double the difference between the raw data and the trend values, with the smoothing parameter
chosen to minimize the error on the training set. The two individual series, the linear trend and the
optimized exponential smoothing of the decomposed series, were recombined by averaging the two. As
already mentioned, we implemented both versions of the Theta model: one with the first observation of
the time series as the initialization value, and the other with the average of the training set as the
initialization value.
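A minimal Python sketch of this decomposition, reconstructed from the description above (it is not the competition code, and in practice alpha would be optimized on the training set rather than passed in), might read:

    import numpy as np

    def theta_forecast(y, alpha, horizon=1, init="first"):
        t = np.arange(len(y))
        slope, intercept = np.polyfit(t, y, 1)   # theta = 0 line: linear trend
        trend = intercept + slope * t
        theta2 = trend + 2.0 * (y - trend)       # double the detrended residuals

        # Simple exponential smoothing of the theta = 2 line, with the
        # selectable initial value discussed earlier.
        level = theta2[0] if init == "first" else float(theta2.mean())
        for value in theta2:
            level = alpha * value + (1 - alpha) * level

        future_t = len(y) + np.arange(horizon)
        trend_fc = intercept + slope * future_t  # extrapolated trend forecast
        ses_fc = np.full(horizon, level)         # flat SES forecast
        return (trend_fc + ses_fc) / 2.0         # equal-weight recombination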
Neural Network Details
Neural networks, while being universal approximators, may suffer from the "overfitting" problem, i.e.,
building complex non-linear mappings where simpler ones are actually required. Overfitting
leads to poor generalization and can be combated by adding more data to the training set or by keeping
the learning power (size) of the network low. Setting the window size to 5% of the training set for
the regular time series models results in a ratio of 1 input to 20 observations. Therefore, to pro-
vide an appropriate level of non-linearity and additional modeling power, we created one hidden layer
containing 2 neurons with non-linear transfer functions. Even then, with the small datasets, there is still
a danger of overfitting the data. The total number of weights for a neural network with one hidden layer
can be calculated as follows:
Total Weights = p_w ⋅ h + b_h ⋅ h + h ⋅ o + b_o ⋅ o,

where p_w is the number of network inputs (the window size), h is the number of hidden neurons, o is the number of outputs, and b_h and b_o are the bias inputs (each equal to 1) of the hidden and output layers.
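For example, with a hypothetical window of p_w = 5 inputs (the window size here is illustrative only) feeding the 2-neuron hidden layer described above (h = 2) and a single output (o = 1), the network would have Total Weights = 5 ⋅ 2 + 1 ⋅ 2 + 2 ⋅ 1 + 1 ⋅ 1 = 15 adjustable weights.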