The whole process of learning a contextual model is to some extent similar to hierarchical learning [11]. However, in our approach each constituent model, and the corresponding contextual map, can be processed independently (in particular, in parallel). Also, a partial incremental update of such a model appears to be much easier to perform, in terms of model quality, stability and time complexity.
The possibility of incremental learning stems from the fact that the very nature of the learning process is iterative. So if new documents arrive, we can consider the learning process as having been stopped at some stage and resumed now with all the documents. We claim that it is not necessary to start the learning process from scratch, either in the case that the new documents "fit" the distribution of the previous ones or when their term distribution is significantly different. This claim is supported by experimental results presented, e.g., in [13].
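As a rough illustration of this claim, the following Python sketch resumes an iterative, SOM-style training loop when new documents arrive instead of restarting it; all names, schedules and the update rule are illustrative assumptions, not the BEATCA implementation:

```python
import numpy as np

def train(model, docs, n_iter=50, lr=0.1):
    """One round of a generic iterative learning loop.
    `model` holds reference vectors (one per unit); `docs` is a
    matrix of document term vectors. Update rule is illustrative."""
    for _ in range(n_iter):
        for x in docs:
            w = np.argmin(np.linalg.norm(model - x, axis=1))  # winner unit
            model[w] += lr * (x - model[w])                    # move winner toward x
    return model

# Initial training on the first batch of documents.
rng = np.random.default_rng(0)
model = rng.random((16, 100))            # 16 units, 100-term vocabulary
old_docs = rng.random((200, 100))
model = train(model, old_docs)

# New documents arrive: resume from the current model rather than
# restarting from scratch, presenting old and new documents together.
new_docs = rng.random((50, 100))
model = train(model, np.vstack([old_docs, new_docs]), n_iter=10)
```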
3 Immune Approach to Text Data Clustering
One of the main goals of the BEATCA project was to create multidimensional document maps in which geometrical vicinity would reflect conceptual closeness of documents in a given document set.
In the SOM algorithm [14], each unit of an $m \times m$ grid contains a so-called reference vector $v_i$, whose dimension agrees with the dimension of the training examples. The training examples are repeatedly presented to the network until a termination criterion is satisfied. When an example $x(t)$ is presented at time $t$ to the network, the reference vectors are updated according to the rule

$$v_i(t+1) = v_i(t) + \alpha_i(t) \cdot (x(t) - v_i(t)), \quad i = 1, \ldots, m \times m \qquad (4)$$

where $\alpha_i(t)$ is the so-called learning rate, varying according to the recipe

$$\alpha_i(t) = \epsilon(t) \cdot \exp\!\left(-\frac{d(i,w)}{\sigma^2(t)}\right)$$

Here $\epsilon(t)$ and $\sigma(t)$ are two user-defined monotone decreasing functions of time called, respectively, step size (or cooling schedule) and neighborhood radius. The symbol $d(i,w)$ stands for the distance (usually the Manhattan distance) between the $i$-th unit and the so-called winner unit (i.e. the unit whose reference vector is most similar to the example $x(t)$).
The main deficiencies of SOM are (cf. [1]): (a) it is order dependent, i.e. the components of the final weight vectors are affected by the order in which the training examples are presented; (b) the components of these vectors may be severely affected by noise and outliers; (c) the size of the grid, the step size and the size of the neighborhood must be tuned individually for each data set to achieve useful results; (d) high computational complexity.
GNG [9] uses the same equation (4) to update reference vectors, but with a fixed learning rate α. Moreover, its output is a graph rather than a grid. The main idea is that, starting from very few nodes (typically two), one new node is inserted every λ iterations near the node with the largest accumulated local error. There is also the possibility to remove nodes: every λ iterations the node with the lowest utility for error reduction is removed. A skeletal sketch of this loop is given below.
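The following Python sketch shows the skeleton of such a loop under strong simplifications: the edge aging, neighbour adaptation and utility-based removal of the full algorithm [9] are omitted, and the insertion point is simplified; all parameter values are illustrative:

```python
import numpy as np

def gng_sketch(docs, lam=100, alpha=0.05, n_iter=10_000, max_nodes=50):
    """Skeletal GNG-style loop: fixed learning rate, node insertion
    every lam iterations near the node with the largest accumulated
    error. Edge handling and node removal are omitted for brevity."""
    rng = np.random.default_rng(0)
    nodes = [rng.random(docs.shape[1]) for _ in range(2)]  # start from two nodes
    error = [0.0, 0.0]                                     # accumulated local error
    for t in range(1, n_iter + 1):
        x = docs[rng.integers(len(docs))]
        w = int(np.argmin([np.linalg.norm(n - x) for n in nodes]))
        error[w] += np.linalg.norm(nodes[w] - x) ** 2      # accumulate winner's error
        nodes[w] += alpha * (x - nodes[w])                 # fixed learning rate update
        if t % lam == 0 and len(nodes) < max_nodes:
            q = int(np.argmax(error))                      # node with largest error
            nodes.append((nodes[q] + x) / 2.0)             # simplified insertion point
            error[q] /= 2.0
            error.append(error[q])
    return nodes
```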
The main disadvantages of GNG are (cf. [1]): (a) in comparison with SOM it requires a larger number of