[Figure 4.5: two surface plots, (a) g1(x) and (b) g2(x), over the input dimensions x1 and x2.]
Fig. 4.5. Plots showing the generalised softmax function (4.22) for 2 classifiers with inputs x = (1, x1, x2)^T and φ(x) = 1, where Classifier 1 in plot (a) has gating parameters v1 = 0.7 and matches a circle of radius 3 around the origin, and Classifier 2 in plot (b) has gating parameters v2 = 0.3 and matches all inputs
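To make the figure concrete, the following is a minimal sketch of how such gating values could be computed, assuming the generalised softmax function (4.22) has the form g_k(x) = m_k(x) exp(v_k^T φ(x)) / Σ_j m_j(x) exp(v_j^T φ(x)), where m_k is the matching function of classifier k. The matching functions and parameter values below follow the caption; the function and variable names are purely illustrative.

```python
import numpy as np

def generalised_softmax(x, matching, V, phi):
    """Gating values g_k(x) = m_k(x) exp(v_k^T phi(x)) / sum_j m_j(x) exp(v_j^T phi(x))."""
    weights = np.array([m(x) * np.exp(v @ phi(x)) for m, v in zip(matching, V)])
    return weights / weights.sum()

# Setup from Fig. 4.5: phi(x) = 1, so each v_k reduces to a scalar.
phi = lambda x: np.array([1.0])
V = [np.array([0.7]), np.array([0.3])]
m1 = lambda x: 1.0 if x[1]**2 + x[2]**2 <= 3.0**2 else 0.0  # circle of radius 3
m2 = lambda x: 1.0                                          # matches all inputs
matching = [m1, m2]

x_inside = np.array([1.0, 0.0, 0.0])   # x = (1, x1, x2)^T
x_outside = np.array([1.0, 4.0, 4.0])
print(generalised_softmax(x_inside, matching, V, phi))   # approx. [0.6, 0.4]
print(generalised_softmax(x_outside, matching, V, phi))  # [0.0, 1.0]
```

Inside the circle both classifiers match, so the gating values follow the softmax over the scalar parameters, roughly 0.6 and 0.4; outside the circle only Classifier 2 matches and therefore receives all the gating weight, in line with the plateaus visible in the plots.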
expectation of the classifier matching the corresponding input. Overall, matching by degree allows the specification of soft boundaries for the matched areas, which can be interpreted as uncertainty about the exact area to match⁴, justified by the limited amount of available data. This might resolve issues with hard classifier matching boundaries when searching for good model structures, which can occur when the input space X is very large or even infinite, leading to a possibly infinite number of possible model structures. In that case, smoothing the classifier matching boundaries, as employed in Chap. 8, makes fully covering the input space with classifiers easier.
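As an illustration of matching by degree, a hard circular matching function like the one above could be replaced by a smoothed version whose value decays gradually across the boundary. The following is a minimal sketch under that assumption; the logistic form and the softness parameter are hypothetical choices, not the specific smoothing employed in Chap. 8.

```python
import numpy as np

def soft_circle_match(x, radius=3.0, softness=0.5):
    """Matching by degree: close to 1 well inside the circle, close to 0 well
    outside, with a smooth logistic transition of width ~softness at the boundary."""
    r = np.hypot(x[1], x[2])
    return 1.0 / (1.0 + np.exp((r - radius) / softness))

for r in (0.0, 2.5, 3.0, 3.5, 5.0):
    x = np.array([1.0, r, 0.0])
    print(f"r = {r:.1f}: m(x) = {soft_circle_match(x):.3f}")
```

The transition width expresses how much uncertainty about the exact matched area the classifier admits: a small softness approaches hard matching, while a larger one spreads the boundary over a wider region of the input space.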
4.3.6 Training Issues
If each input is matched by only a single classifier, each classifier model is trained separately, and the problem of getting stuck in local maxima does not occur, analogous to the discussion that follows in Sect. 4.4.3. Classifiers with overlapping matching areas, on the other hand, cause the same training issues as already discussed for the standard MoE model in Sect. 4.1.5, which makes model training time-consuming.
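The disjoint case can be sketched as follows: when matching areas do not overlap, each classifier's model can be fitted on its matched subset in isolation, with no interaction between classifiers and hence no local maxima introduced by the gating. The data, the circular split, and the linear classifier models below are hypothetical illustrations.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-4, 4, size=(200, 2))
y = np.where(np.hypot(X[:, 0], X[:, 1]) <= 3, 1.0, -1.0) + rng.normal(0, 0.1, 200)

# Disjoint matching: classifier 1 gets inputs inside the circle, classifier 2 the rest.
inside = np.hypot(X[:, 0], X[:, 1]) <= 3.0
for k, mask in enumerate((inside, ~inside), start=1):
    # Independent least-squares fit of a linear model on the matched subset only.
    Xk = np.column_stack([np.ones(mask.sum()), X[mask]])
    w, *_ = np.linalg.lstsq(Xk, y[mask], rcond=None)
    print(f"classifier {k}: weights = {np.round(w, 3)}")
```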
In the presented approach, LCS training is conceptually split into two parts: training the model for a fixed model structure, and searching the space of possible model structures. For the latter, the evaluation of a single model structure by training the model needs to be efficient. Hence, the current training strategy is hardly a viable option. However, identifying the cause of the local maxima allows the model to be modified such that they are avoided, making model training more efficient, as shown in the next section.
⁴ Thanks to Dr. Dan Richardson, University of Bath, for this interpretation.