$\sigma(t) = \sigma_0 + (\sigma_1 - \sigma_0) \cdot \frac{T+1}{T \cdot (t+1)}$ and $m_b(t) = m_0 + \frac{m_1 - m_0}{T} \cdot t$,
where $\sigma_0 = 0.05$, $\sigma_1 = 0.25$ for $\sigma_s(t)$; $\sigma_0 = 0.1$, $\sigma_1 = 0.4$ for $\sigma_d(t)$;
$m_0 = 3$, $m_1 = 1$ for $m_b(t)$.
4 Experimental Results
In the following sections, the overall experimental design and the quality
measures are described. Since an immune network can be treated both as a
clustering and as a meta-clustering (clusters of clusters) model, besides the
commonly used clustering quality measures (unsupervised and supervised), we
have also investigated the immune network structure. The results are discussed
in Sect. 4.3-4.7.
4.1 Quality Measures of the Clustering
Various measures of quality have been developed in the literature, covering
diverse aspects of the clustering process. The clustering process is frequently
referred to as "learning without a teacher", or "unsupervised learning", and is
driven by some kind of similarity measure. The optimized criterion is intended
to reflect some aesthetic preferences, such as a uniform split into groups
(topological continuity) or an appropriate split of documents with a known
a priori categorization. As the criterion is to some extent hidden, we need to
test whether the clustering process really meets the expectations. In
particular, we have adapted for our purposes and investigated the following
well-known quality measures of clustering [19,5]:
Average Document Quantization: the average cosine distance (dissimilarity),
computed over the learning set, between a document and the cell it was
classified into.
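The definition above can be illustrated with a minimal Python sketch. It assumes that documents and cells are given as non-negative term-weight vectors (plain lists of floats) and that `assignment[i]` holds the index of the cell document `i` was classified into. The function names and the toy data are hypothetical and are not taken from the original experiments.

```python
import math


def cosine_distance(u, v):
    """Cosine dissimilarity 1 - cos(u, v) between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention assumed for this sketch: an all-zero vector is
        # treated as maximally dissimilar to everything.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def average_document_quantization(docs, cells, assignment):
    """Mean cosine distance between each document and its assigned cell."""
    if not docs:
        raise ValueError("docs must be non-empty")
    total = sum(
        cosine_distance(doc, cells[assignment[i]])
        for i, doc in enumerate(docs)
    )
    return total / len(docs)


# Toy data (hypothetical): three documents, two cells.
docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cells = [[1.0, 0.0], [0.0, 1.0]]
assignment = [0, 1, 0]
adq = average_document_quantization(docs, cells, assignment)
```

In this toy example the first two documents coincide with their cells, so only the third one contributes a non-zero distance to the average.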
Average Document Quantization takes values in the [0,1] interval; lower values
correspond to "smoother" inter-cluster transitions and more "compact" clusters.
The two subsequent measures evaluate the agreement between the clustering and
the a priori categorization of documents (i.e. the particular newsgroup in the
case of newsgroup messages).
Average Weighted Cluster Purity: average "category purity" of a cell (cell
weight is equal to its density, i.e. the number of assigned documents):
$AvgPurity = \frac{1}{|D|} \sum_{n \in N} \max_c \left( |D_c(n)| \right)$,
where D is the set of all documents in the corpus and D_c(n) is the set of
documents from category c assigned to the cell n. Similarly, the Average
Weighted Cluster Entropy measure can be calculated, where the D_c(n) term is
replaced with the entropy of the category frequency distribution.
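As an illustration of the purity formula, the following Python sketch counts, for every cell, the documents of its most frequent category and divides the total by the corpus size. The input layout (two parallel lists holding the cell and the a priori category of each document) and all names are assumptions made for this example only.

```python
from collections import Counter, defaultdict


def average_weighted_purity(cell_of_doc, category_of_doc):
    """AvgPurity = (1/|D|) * sum over cells n of max_c |D_c(n)|."""
    if len(cell_of_doc) != len(category_of_doc):
        raise ValueError("both lists must describe the same documents")
    if not cell_of_doc:
        raise ValueError("the corpus must be non-empty")
    # counts[n][c] = |D_c(n)|, the number of documents of category c in cell n.
    counts = defaultdict(Counter)
    for cell, category in zip(cell_of_doc, category_of_doc):
        counts[cell][category] += 1
    dominant = sum(max(per_cell.values()) for per_cell in counts.values())
    return dominant / len(cell_of_doc)


# Toy data (hypothetical): five documents, two cells, two categories.
cell_of_doc = [0, 0, 0, 1, 1]
category_of_doc = ["a", "a", "b", "b", "b"]
purity = average_weighted_purity(cell_of_doc, category_of_doc)
```

Here cell 0 is dominated by category "a" (2 of 3 documents) and cell 1 contains only category "b" (2 of 2), so 4 of the 5 documents fall into the dominant category of their cell.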
Normalized Mutual Information: the ratio of the mutual information between the
category and cluster frequency distributions to the square root of the product
of the category and cluster entropies [5].
Again, both measures take values in the [0,1] interval. The higher the value,
the better the agreement between the clusters and the a priori categories.
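A minimal Python sketch of Normalized Mutual Information follows. It assumes the common form NMI = I(C;K) / sqrt(H(C) · H(K)), where C denotes the category labels and K the cluster labels. This is one reading of the verbal definition above and may differ in detail from the variant used in [5]. All function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural logarithm) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(xs, ys):
    """Mutual information between two label sequences of equal length."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    return sum(
        (n_xy / n) * math.log(n_xy * n / (count_x[x] * count_y[y]))
        for (x, y), n_xy in joint.items()
    )


def normalized_mutual_information(clusters, categories):
    """I(C;K) / sqrt(H(C) * H(K)); returns 0.0 if either entropy is zero."""
    if len(clusters) != len(categories) or not clusters:
        raise ValueError("label sequences must be non-empty and equally long")
    h_clusters = entropy(clusters)
    h_categories = entropy(categories)
    if h_clusters == 0.0 or h_categories == 0.0:
        # Convention assumed for this sketch: a single cluster or a single
        # category carries no information, so the score is set to zero.
        return 0.0
    mi = mutual_information(clusters, categories)
    return mi / math.sqrt(h_clusters * h_categories)


# Toy data (hypothetical): perfect agreement versus independent labelings.
nmi_perfect = normalized_mutual_information([0, 0, 1, 1], ["a", "a", "b", "b"])
nmi_independent = normalized_mutual_information([0, 0, 1, 1], ["a", "b", "a", "b"])
```

The two toy cases mark the ends of the scale: clusters that reproduce the categories exactly score (up to rounding) 1, while clusters that are statistically independent of the categories score 0.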
4.2 Quality of the Immune Network
Besides the clustering structure represented by cells, the idiotypic network
should also be treated as a meta-clustering model. Similarity between individual clusters is
 