Solving for $\lambda$ and using $\sum_j w_j = 1$ and $\sum_j y_{nj} = 1$ for all $n$, we get $\lambda = \sum_n m(\mathbf{x}_n) = c$, which is the match count after $N$ observations. As a result, by the principle of maximum likelihood, $\mathbf{w}$ after $N$ observations is given by

$$\mathbf{w} = c^{-1} \sum_{n=1}^{N} m(\mathbf{x}_n)\, \mathbf{y}_n. \tag{5.82}$$

Thus, the $j$th element of $\mathbf{w}$, representing the probability of the classifier having generated an observation of class $j$, is the number of matched observations of this class divided by the total number of observations that the classifier matches, a straightforward frequentist measure.

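The batch estimate (5.82) can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the function name `ml_class_probabilities`, the one-hot encoding of the class vectors $\mathbf{y}_n$, and the five observations are hypothetical and not taken from the text.

```python
import numpy as np

def ml_class_probabilities(m, Y):
    """Batch estimate following (5.82): w = c^{-1} * sum_n m(x_n) y_n.

    m : matching degrees m(x_n), shape (N,)
    Y : class vectors y_n as rows, shape (N, D_Y); assumed one-hot here
    """
    m = np.asarray(m, dtype=float)
    Y = np.asarray(Y, dtype=float)
    c = m.sum()  # match count c = sum_n m(x_n)
    if c == 0:
        raise ValueError("the classifier matches no observation")
    return (m @ Y) / c, c

# Hypothetical data: 5 observations, 2 classes, binary matching.
m = [1, 1, 0, 1, 1]
Y = [[1, 0], [0, 1], [1, 0], [0, 1], [0, 1]]
w, c = ml_class_probabilities(m, Y)
print(w, c)
```

With binary matching, each element of `w` is the matched count of one class divided by the match count, which is the frequentist reading given above.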
5.5.3 Incremental Learning for Classification

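Let $\mathbf{w}_N$ be the estimate of $\mathbf{w}$ after $N$ observations. Given the new observation $(\mathbf{x}_{N+1}, \mathbf{y}_{N+1})$, the aim of the incremental approach is to find a computationally efficient approach to update $\mathbf{w}_N$ to reflect this new knowledge. By (5.82), $c_{N+1} \mathbf{w}_{N+1}$ is given by

$$\begin{aligned}
c_{N+1} \mathbf{w}_{N+1} &= \sum_{n=1}^{N+1} m(\mathbf{x}_n)\, \mathbf{y}_n \\
&= \sum_{n=1}^{N} m(\mathbf{x}_n)\, \mathbf{y}_n + m(\mathbf{x}_{N+1})\, \mathbf{y}_{N+1} \\
&= \left(c_{N+1} - m(\mathbf{x}_{N+1})\right) \mathbf{w}_N + m(\mathbf{x}_{N+1})\, \mathbf{y}_{N+1}.
\end{aligned} \tag{5.83}$$
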
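The last equality in (5.83) may be easier to follow with its two substitutions written out. The subscripted $c_N$ is notation assumed here for the match count after $N$ observations; both lines follow from (5.82) and from the match count being the sum of the matching degrees.

```latex
\begin{align*}
\sum_{n=1}^{N} m(\mathbf{x}_n)\, \mathbf{y}_n &= c_N \mathbf{w}_N
  && \text{by (5.82) applied after $N$ observations,} \\
c_N = \sum_{n=1}^{N} m(\mathbf{x}_n) &= c_{N+1} - m(\mathbf{x}_{N+1})
  && \text{since } c_{N+1} = c_N + m(\mathbf{x}_{N+1}).
\end{align*}
```
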
Dividing the above by $c_{N+1}$ results in the final incremental update

$$\mathbf{w}_{N+1} = \mathbf{w}_N - c_{N+1}^{-1}\, m(\mathbf{x}_{N+1}) \left(\mathbf{w}_N - \mathbf{y}_{N+1}\right). \tag{5.84}$$

This update tracks (5.82) accurately, is of complexity $\mathcal{O}(D_Y)$, and only requires the parameter vector $\mathbf{w}$ and the match count $c$ to be stored. Thus, it is accurate and efficient.

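A minimal Python sketch of the incremental update (5.84), checked against the batch estimate (5.82), is given below. The function name `incremental_update`, the zero initialisation, the handling of a zero match count, and the data are assumptions made for illustration; only the update rule itself comes from the text.

```python
import numpy as np

def incremental_update(w, c, m_new, y_new):
    """One step of (5.84): w <- w - (m / c_new) * (w - y), with c_new = c + m."""
    c_new = c + m_new  # match count after the new observation
    if c_new == 0:
        return w, c_new  # nothing matched so far (assumed handling): keep w
    w_new = w - (m_new / c_new) * (w - y_new)
    return w_new, c_new

# Hypothetical data: 5 observations, 2 classes, binary matching, one-hot y_n.
m = np.array([1.0, 1.0, 0.0, 1.0, 1.0])
Y = np.array([[1, 0], [0, 1], [1, 0], [0, 1], [0, 1]], dtype=float)

# The initial w is overwritten by the first matched observation,
# because m / c_new equals 1 when c was 0 before that observation.
w_inc, c_inc = np.zeros(2), 0.0
for m_n, y_n in zip(m, Y):
    w_inc, c_inc = incremental_update(w_inc, c_inc, m_n, y_n)

# Batch estimate (5.82) on the same data, for comparison.
w_batch = (m @ Y) / m.sum()
print(w_inc, w_batch)
```

Each step touches only the $D_Y$ elements of `w` and the scalar `c`, which matches the storage and complexity stated above.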
Example 5.9 (Classifier Model for Classification). Figure 5.3 shows the data of a classification task with two distinct classes. Observations of classes 1 and 2 are shown by circles and squares, respectively. The larger rectangles indicate the matched areas of the input space of the three classifiers $c_1$, $c_2$, and $c_3$. Based on these data, the number of matched observations of each class as well as $\mathbf{w}$ and $\tau$ are shown for each classifier in Table 5.2.

Recall that the elements of $\mathbf{w}$ represent the estimated probabilities of having generated an observation of a specific class. The estimates in Table 5.2 show that Classifier $c_3$ is most certain about modelling class 2, while Classifier $c_2$ is most uncertain about which class it models. These values are also reflected in $\tau^{-1}$, which is highest for $c_2$ and lowest for $c_3$. Thus, $c_3$ is the “best” classifier, while $c_2$ is the “worst”, an evaluation that reflects what can be observed in Fig. 5.3.