kernel functions allow for sufficiently rich feature spaces, the performances of SVMs
are comparable in terms of out-of-sample forecasting accuracy (Vapnik, ).
4.3 Company Score Evaluation
The company score is computed as:

f(x) = x^{\top} w + b ,     ( . )

where w = \sum_{i=1}^{n} \alpha_i y_i x_i and b = -\tfrac{1}{2} (x_+ + x_-)^{\top} w; x_+ and x_- are the observations from the opposite classes for which constraint ( . ) becomes an equality. By substituting the scalar product with a kernel function, we will derive a nonlinear score function:

f(x) = \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b .     ( . )
The nonparametric score function ( . ) does not have a compact closed-form representation. This means that graphical tools are required to visualise it.
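To make the computation concrete, the following minimal sketch evaluates a score of this form in Python, assuming the coefficients α_i, the labels y_i, the training observations x_i, and the threshold b have already been obtained from training the SVM; the RBF kernel and all function names are illustrative choices, not the chapter's own implementation.

```python
import numpy as np

def rbf_kernel(x1, x2, sigma=1.0):
    """Gaussian (RBF) kernel; any admissible kernel K could be used instead."""
    return np.exp(-np.sum((x1 - x2) ** 2) / (2.0 * sigma ** 2))

def svm_score(x, X_train, y_train, alphas, b, kernel=rbf_kernel):
    """Nonparametric score f(x) = sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(a * y * kernel(xi, x)
               for a, y, xi in zip(alphas, y_train, X_train)) + b
```

Because f has no compact closed form, one would evaluate such a score function on a grid of inputs in order to visualise it, as suggested above.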
4.4 Variable Selection
In this section we describe the procedure and the graphical tools for selecting the
variables of the SVM model used in forecasts. We have two very important model
accuracy criteria: the accuracy ratio (AR), which will be used here as a criterion for
model selection (Fig. . ), and the percentage of correctly classified out-of-sample
observations. Higher values of either criterion indicate better model accuracy.
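As an illustration of the first criterion, the sketch below computes the AR from company scores and insolvency indicators via the standard identity AR = 2·AUC − 1 between the accuracy ratio (Gini coefficient) and the area under the ROC curve; the sign convention (higher scores for insolvent companies) and the function name are assumptions, not taken from the chapter.

```python
import numpy as np

def accuracy_ratio(scores, insolvent):
    """Accuracy ratio (AR) from scores and insolvency indicators.

    Uses AR = 2 * AUC - 1, where AUC is the probability that a randomly
    chosen insolvent company receives a higher score than a randomly
    chosen solvent one (ties counted as 1/2).
    """
    scores = np.asarray(scores, dtype=float)
    insolvent = np.asarray(insolvent, dtype=bool)
    pos = scores[insolvent]        # insolvent companies
    neg = scores[~insolvent]       # solvent companies
    # fraction of (insolvent, solvent) pairs that are ranked correctly
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    auc = (greater + 0.5 * ties) / (len(pos) * len(neg))
    return 2.0 * auc - 1.0
```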
Model selection proceeds from the simplest (i.e. univariate) models to the one
with the highest AR. The problem that arises is: how do we determine the variable
that provides the highest AR across possible data samples? For a parametric model,
we would need to estimate the distribution of the coefficients of the variables and
therefore their confidence intervals. This approach, however, is not applicable to
nonparametric models.
Instead we can compare models using an accuracy measure, in our case the AR. We
first estimate the AR distributions for different models. This can be done using boot-
strapping (Horowitz, ). We randomly select training and validation sets, each of
which is a subsample of solvent and insolvent companies. We use a / ratio
since this is the worst case with the minimum AR. The two sets do not overlap, i.e.
they do not contain common observations. For each of these sets we apply the SVM
with the parameters that provide the highest AR for bivariate models (Fig. . ) and esti-
mate the ARs. Then we perform a Monte Carlo experiment: we repeat this process of
generating subsamples and computing ARs times. Each time we record the ARs,
and then we estimate their distribution.
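A minimal sketch of this bootstrap experiment is given below. It uses scikit-learn's SVC together with the accuracy_ratio function from the earlier sketch; the subsample sizes, the fixed (C, gamma) values, and the use of simple random rather than class-balanced subsampling are illustrative assumptions, since the chapter's exact sampling ratio and repetition count are not reproduced here.

```python
import numpy as np
from sklearn.svm import SVC

def bootstrap_ar_distribution(X, y, n_rep=100, n_train=200, n_val=200,
                              C=1.0, gamma=0.25, seed=0):
    """Monte Carlo estimate of the AR distribution for one candidate model.

    X: (n, d) array of the selected financial ratios;
    y: (n,) array with 1 = insolvent, 0 = solvent.
    Each repetition draws disjoint training and validation subsamples,
    fits an SVM with fixed parameters, and records the validation AR.
    """
    rng = np.random.default_rng(seed)
    ars = []
    for _ in range(n_rep):
        idx = rng.permutation(len(y))
        train = idx[:n_train]
        val = idx[n_train:n_train + n_val]      # no overlap with the training set
        svm = SVC(C=C, gamma=gamma, kernel="rbf").fit(X[train], y[train])
        scores = svm.decision_function(X[val])  # company scores on the validation set
        ars.append(accuracy_ratio(scores, y[val] == 1))
    return np.array(ars)                        # empirical AR distribution
```

Comparing the resulting empirical AR distributions, for example by their medians or with box plots, then indicates which candidate variable yields the highest AR.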