expressed by graph edges, linking referential vectors in antibodies. Thus, there
is a need to evaluate the quality of the edge structure.
There are a number of ways to evaluate the structure of an idiotypic model. In
this paper we present the one that we have found the clearest to interpret.
This approach is based on the analysis of the edge lengths of the minimal
spanning tree (MST) constructed over the set of antibodies in each iteration of
the learning process.
4.3 Experimental Settings
The architecture of the BEATCA system supports comparative studies of clustering
methods at the various stages of the process (i.e. initial document grouping,
initial topic identification, incremental clustering, graph model projection to
a 2D map and visualization, identification of topical areas on the map and
their labeling); consult [13] for details. In this paper we focus only on the
evaluation and comparison of the immune models.
This study required manually labelled documents, so the experiments were
executed on the widely used 20 Newsgroups document collection⁴ of approximately
20 thousand newsgroup messages, partitioned into 20 different newsgroups
(about 1000 messages each). As a data preprocessing step in the BEATCA
system, entropy-based dimensionality reduction techniques are applied [12], so
the training data dimensionality (the number of distinct terms used) was 4419.
Each immune model was trained for 100 iterations, using the previously
described algorithms and methods.
4.4 Impact of the Time-Dependent Parameters
In the first two series of experiments, we compared models built with the
time-dependent parameters σ_s(t) and σ_d(t) against models with constant,
a priori defined values of σ_s and σ_d. As a reference case we took a model
where σ_s(t) was changed from the initial value 0.05 up to 0.25 and σ_d(t)
from 0.1 up to 0.4 (cf. section 3.3).
First, we compare the reference model with four models with constant σ_d.
In these models the parameter σ_s was changed exactly as in the reference
model. The constant values of σ_d ranged from the starting value in the
reference model (0.1) up to its final value (0.4), in steps of 0.1. The
results⁵ are presented in Figure 1.
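The experimental configurations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the actual form of σ_s(t) and σ_d(t) is defined in section 3.3, so the linear interpolation used here is an assumption, and the names `linear_schedule`, `constant_schedule`, `reference_model` and `constant_sigma_d_models` are hypothetical, not part of BEATCA.

```python
ITERATIONS = 100  # each immune model is trained for 100 iterations


def linear_schedule(start, end, iterations=ITERATIONS):
    """ASSUMPTION: the parameter grows linearly from start to end.

    The paper defines the real schedule in section 3.3; linear growth is
    used here only to make the sketch concrete.
    """
    def value(t):
        if iterations <= 1:
            return end
        # clamp t to the valid iteration range 0 .. iterations - 1
        t = min(max(t, 0), iterations - 1)
        return start + (end - start) * t / (iterations - 1)
    return value


def constant_schedule(v):
    """A parameter fixed a priori, independent of the iteration t."""
    return lambda t: v


# Reference model: both parameters are time-dependent.
reference_model = {
    "sigma_s": linear_schedule(0.05, 0.25),
    "sigma_d": linear_schedule(0.1, 0.4),
}

# Four comparison models: sigma_s as in the reference model,
# sigma_d held constant at 0.1, 0.2, 0.3 and 0.4.
constant_sigma_d_models = {
    sigma_d: {
        "sigma_s": linear_schedule(0.05, 0.25),
        "sigma_d": constant_schedule(sigma_d),
    }
    for sigma_d in (0.1, 0.2, 0.3, 0.4)
}
```

Representing each parameter as a function of the iteration number lets the same training loop serve both the time-dependent and the constant variants.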
Fig. 1(a) presents the variance of the edge length in the minimal spanning tree
built over the set of antibodies in the immune memory in the i-th iteration of
the learning process. At first glance one can notice the instability of this
measure for high values of σ_d. Comparing the stable values, we notice that the
variance for the reference network is the highest. This means that the
idiotypic network contains both short edges, connecting clusters of more
similar antibodies, and longer edges, linking more distant antibodies, probably
stimulated by different subsets of documents (antigens). Such a meta-clustering
structure is desirable and preferred over networks with equidistant antibodies
(and, thus, low edge length variance).
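The measure discussed above can be illustrated with a minimal sketch. It builds a minimal spanning tree over a set of vectors with Prim's algorithm and returns the variance of its edge lengths. Several details are assumptions made for illustration: the dissimilarity measure (cosine distance by default), the use of population variance, and all function names are hypothetical and not taken from BEATCA.

```python
import math
from statistics import pvariance


def cosine_distance(u, v):
    """ASSUMED dissimilarity: 1 - cosine similarity of two term vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # treat an empty vector as maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def mst_edge_lengths(points, dist=cosine_distance):
    """Edge lengths of a minimal spanning tree, built with Prim's algorithm."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known link from the tree to each node
    best[0] = 0.0          # start growing the tree from node 0
    lengths = []
    for step in range(n):
        # pick the node outside the tree that is closest to it
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:  # the starting node contributes no edge
            lengths.append(best[u])
        # update the cheapest links of the remaining nodes
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return lengths


def mst_edge_length_variance(points, dist=cosine_distance):
    """Population variance of the MST edge lengths (0.0 if no edges)."""
    lengths = mst_edge_lengths(points, dist)
    return pvariance(lengths) if lengths else 0.0
```

As a usage example with Euclidean distance (`math.dist`), four equidistant points `[(0.0,), (1.0,), (2.0,), (3.0,)]` give a variance of zero, whereas two well-separated pairs `[(0.0,), (1.0,), (10.0,), (11.0,)]` give MST edges of lengths 1, 9 and 1 and hence a clearly positive variance, which corresponds to the meta-clustering structure described above.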
4 http://people.csail.mit.edu/jrennie/20Newsgroups/
5 All figures present average values of the respective measures in 20 contextual nets.