Fig. 14.3 Overall prediction performance of three feature selection methods: LASSO, GL, and
SGL. Left: prediction performance in AUC score on test sets over 20 random subsampling trials
(train:test = 70:30%). Right: the corresponding number of selected features
14.3.3.2 Probabilistic Prediction
In logistic regression, the probability that an example $x_i$ will have the label "1" is
modeled by the logistic function

$$P(y_i = 1 \mid x_i) = \frac{1}{1 + \exp\{-(\beta^T x_i + \beta_0)\}} \in [0, 1],$$

where $\beta$ and $\beta_0$ are coefficients estimated during training. Note that this function
always returns a value between zero and one. That is, given $\beta$ and $\beta_0$, logistic
regression provides each test point with a probabilistic outcome in addition to a binary
prediction. This distinguishes it from other classification methods such as
support vector machines [3, 21].
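As a minimal sketch of the formula above, the following Python snippet evaluates the logistic function for a single test point and derives a binary prediction from it. The function names `predict_proba` and `predict_label`, the coefficient values, and the test point are hypothetical and chosen only for illustration; they do not come from the experiments discussed in this chapter.

```python
import math

def predict_proba(x, beta, beta0):
    """Return P(y = 1 | x) = 1 / (1 + exp(-(beta^T x + beta0)))."""
    z = sum(b * xj for b, xj in zip(beta, x)) + beta0
    # Evaluate in a numerically stable way for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_label(x, beta, beta0, threshold=0.5):
    """Binary prediction obtained by thresholding the probability."""
    return 1 if predict_proba(x, beta, beta0) >= threshold else 0

# Hypothetical coefficients and test point (illustration only).
beta = [0.8, -1.2]
beta0 = 0.1
x = [1.0, 0.5]

p = predict_proba(x, beta, beta0)      # probabilistic outcome
label = predict_label(x, beta, beta0)  # binary prediction
print(p, label)
```

The same call yields both outputs mentioned in the text: the probability itself and the label obtained by comparing it with 0.5.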
For two logistic regression classifiers with similar binary prediction performance
(for example, in terms of AUC scores), the one that assigns higher probability to its
correct predictions would arguably be preferred in practice, since it provides higher
confidence in its predictions.
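One possible way to quantify this preference is to average the probability a classifier assigns to the true label over the examples it classifies correctly. The sketch below assumes that summary statistic, which is an illustrative choice and not a measure defined in the text; the helper name `mean_confidence_on_correct` and both sets of probabilities are hypothetical.

```python
def mean_confidence_on_correct(p1_values, labels, threshold=0.5):
    """Average probability assigned to the true label, over correct predictions.

    p1_values holds P(y = 1 | x) for each test example; labels holds 0 or 1.
    """
    confidences = []
    for p1, y in zip(p1_values, labels):
        predicted = 1 if p1 >= threshold else 0
        if predicted == y:
            # Probability of the true label: p1 for y = 1, 1 - p1 for y = 0.
            confidences.append(p1 if y == 1 else 1.0 - p1)
    return sum(confidences) / len(confidences) if confidences else float("nan")

# Hypothetical outputs of two classifiers that make identical binary predictions.
labels = [1, 0, 1, 0]
model_a = [0.60, 0.40, 0.55, 0.45]
model_b = [0.90, 0.10, 0.85, 0.20]

conf_a = mean_confidence_on_correct(model_a, labels)
conf_b = mean_confidence_on_correct(model_b, labels)
print(conf_a, conf_b)
```

Both hypothetical models classify every example correctly, yet the second one does so with probabilities further from 0.5, which is the situation the paragraph describes.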
Figure 14.4 compares such probability values for the three feature selection
methods LASSO, GL, and SGL. The x-axis shows the indices of test examples
(in a test set created by random subsampling), while the y-axis shows the probability
values we discussed above. The circles show the true labels, 0 or 1. The probability
outcomes from each algorithm are connected by lines only for visual distinction,
without any other implication. The decision probability ($P(y = 1) = 0.5$) is shown
as a horizontal line.
As we can see, GL and SGL provided higher values of probability outcomes for
correct labels ($P(y = 1)$ or $P(y = 0) = 1 - P(y = 1)$), at least for this particular test
set. The characteristics of GL and SGL were similar: on the 11th example GL provided
slightly higher probability than SGL, and both misclassified the 16th, 22nd, and 23rd
examples, which were classified correctly by LASSO.
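A per-example comparison of this kind can be sketched as follows: threshold each method's probabilities at 0.5, collect the indices of misclassified test examples, and take the set difference between methods. The function name `misclassified`, the labels, and the probabilities for the two methods are hypothetical placeholders, not the values plotted in Figure 14.4.

```python
def misclassified(p1_values, labels, threshold=0.5):
    """Return 1-based indices of examples whose thresholded prediction is wrong."""
    wrong = []
    for index, (p1, y) in enumerate(zip(p1_values, labels), start=1):
        predicted = 1 if p1 >= threshold else 0
        if predicted != y:
            wrong.append(index)
    return wrong

# Hypothetical true labels and P(y = 1 | x) values from two methods.
labels = [1, 0, 1, 1, 0]
method_a = [0.65, 0.40, 0.55, 0.60, 0.45]
method_b = [0.90, 0.15, 0.80, 0.35, 0.10]

errors_a = misclassified(method_a, labels)
errors_b = misclassified(method_b, labels)

# Examples that method B gets wrong but method A gets right.
only_b = sorted(set(errors_b) - set(errors_a))
print(errors_a, errors_b, only_b)
```

In this made-up data, the second method is more confident on most examples but misclassifies one that the first method gets right, mirroring the kind of trade-off noted above.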
 