Figure . . Accuracy ratios for univariate SVM models. Box plots are estimated based on random
subsamples. The AR for the model containing only the random variable K is zero. The
line depicts medians. The box within each box plot shows the interquartile range
(IQR), while the whiskers span to the distance of / IQR in each direction from the
median. Outliers beyond that range are denoted by circles.
Based on Fig. . , we can conclude that variables K (Debt Cover) and K (Interest
Coverage Ratio) provide the highest median AR, of around %. We also notice
that the variables K , K , and K yield very low accuracy: their median ARs do
not exceed . %. The model based on the random variable K has an AR of zero;
in other words, it has no predictive power whatsoever. For the next step we will select
variable K , which was included in the best univariate model.
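To make the AR comparison concrete, here is a minimal sketch of how an accuracy ratio can be computed from model scores. It assumes the AR is the usual CAP-based accuracy ratio, which equals 2 * AUC - 1; the function names `auc` and `accuracy_ratio` and the toy data are hypothetical and do not come from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case (label 1)
    receives a higher score than a randomly chosen negative case (label 0);
    ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one observation of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_ratio(scores, labels):
    """Accuracy ratio under the assumption AR = 2 * AUC - 1."""
    return 2.0 * auc(scores, labels) - 1.0


# Toy labels: 1 = default, 0 = no default (illustrative only).
labels = [1, 1, 0, 0]
perfect = accuracy_ratio([0.9, 0.8, 0.2, 0.1], labels)       # scores separate the classes
uninformative = accuracy_ratio([0.5, 0.5, 0.5, 0.5], labels)  # scores carry no information
print(perfect, uninformative)
```

Under this definition a score that separates the two classes perfectly has an AR of one, while a score that carries no information about the outcome has an AR of zero, which is the behaviour reported for the random variable.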
For bivariate models, we will select the best predictor from the univariate models
(K ) and one of the rest that delivers the highest AR (K ) (Fig. . ). This procedure
will be repeated for each new variable added. The AR grows until the model has eight
variables, and then it slowly declines. Median ARs for the models with eight variables
are shown in Fig. . . The forward selection procedure cannot guarantee that the
variables selected will provide the highest accuracy. However, since many of the variables
are highly correlated, we can expect that the selected variables capture most of the
information.
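The forward selection procedure described above can be sketched as a greedy loop. This is an illustration under stated assumptions, not the authors' implementation: `evaluate` is a hypothetical callback that returns one AR value per random subsample for a given variable subset (in the study this would involve training an SVM on each subsample), and `toy_evaluate` is an invented stand-in that only mimics an AR that first rises and then declines.

```python
from statistics import median


def forward_selection(candidates, evaluate, max_vars=None):
    """Greedy forward selection ranked by median AR across subsamples.

    candidates: iterable of variable names.
    evaluate:   hypothetical callback; takes a list of variable names and
                returns a list of AR values, one per subsample.
    Returns the selection path as a list of (subset, median_ar) tuples.
    """
    selected = []
    remaining = list(candidates)
    limit = len(remaining) if max_vars is None else min(max_vars, len(remaining))
    path = []
    while remaining and len(selected) < limit:
        best_var, best_score = None, float("-inf")
        for var in remaining:
            # Try adding each remaining variable to the current subset.
            score = median(evaluate(selected + [var]))
            if score > best_score:
                best_var, best_score = var, score
        selected.append(best_var)
        remaining.remove(best_var)
        path.append((tuple(selected), best_score))
    return path


def best_subset(path):
    """Step of the path with the highest median AR."""
    return max(path, key=lambda step: step[1])


def toy_evaluate(subset):
    # Invented numbers: each variable adds a fixed gain, and every extra
    # variable costs a small penalty, so the AR eventually declines.
    gain = {"a": 0.30, "b": 0.20, "c": 0.01, "d": 0.0}
    base = sum(gain[v] for v in subset) - 0.02 * len(subset)
    return [base - 0.01, base, base + 0.01]  # three pseudo-subsamples


path = forward_selection(["a", "b", "c", "d"], toy_evaluate)
for subset, score in path:
    print(subset, round(score, 2))
print("best:", best_subset(path)[0])
```

Because each step keeps all previously chosen variables, the loop evaluates far fewer subsets than an exhaustive search, which is also why it cannot guarantee that the final subset is the most accurate one.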
We have also conducted experiments with subsamples of observations. The
change in the median was extremely small ( - orders of magnitude smaller than the
interquartile range). As expected, the interquartile range narrowed, i.e. the difference
between the models with more samples became more statistically significant. Thus, if