The graphs in Figs. 7-9 plot the expected occurrence EO(r, m), given by relation 4, on the y-axis, while the weight of the patterns, expressed as a fraction of n (the number of bits in a pattern), is plotted on the x-axis.
In Fig. 7, where m = 1, the expected occurrence is not a monotonically decreasing function of weight; it peaks at a slightly higher weight value. However, as m is increased (Fig. 8, m = 2), the function becomes monotonically decreasing.
The gradient becomes steeper as m is increased further (Fig. 9). The graph of Fig. 9 is plotted for different values of m, keeping n (= 30) constant; it shows that the expected occurrence of lower weight patterns in the zero basin increases manifold.
5 Performance Analysis of MACA-Based Classifier
For convenience of performance analysis, the distributions of patterns in the two classes are assumed to be as shown in Fig. 10. Each pair of sets on which the classifiers are run is characterized by one of the curves (a−a, b−b, c−c, d−d). The ordinate of a curve represents the number of pairs of patterns at the specified Hamming distance; for example, point A (on curve a) corresponds to y pairs of patterns at Hamming distance x. The abscissa is plotted in both directions: from left to right for Class I, and from right to left for Class II. The curves of Class I and Class II overlap if D_min < d_max. The ideal distribution a−a is represented by the continuous line, with no overlap between the two classes.
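The sketch below illustrates, under stated assumptions, how such distance curves and the overlap condition can be computed for a pair of pattern sets. The synthetic data, the helper intra_class_histogram, and the reading of d_max as the largest intra-class distance and D_min as the smallest inter-class distance are our own assumptions for illustration, not taken from the text.

```python
import numpy as np
from itertools import combinations

def intra_class_histogram(patterns, n):
    """Histogram of pairwise Hamming distances within one class of n-bit patterns."""
    counts = np.zeros(n + 1, dtype=int)
    for a, b in combinations(patterns, 2):
        counts[np.count_nonzero(a != b)] += 1
    return counts

# Hypothetical data: Class I clustered around low-weight patterns, Class II
# around high-weight patterns (the generation scheme is an assumption).
rng = np.random.default_rng(0)
n = 30
class1 = (rng.random((200, n)) < 0.1).astype(np.uint8)
class2 = (rng.random((200, n)) < 0.9).astype(np.uint8)

h1 = intra_class_histogram(class1, n)
h2 = intra_class_histogram(class2, n)

# Overlap check in the spirit of Fig. 10: d_max is the largest intra-class
# distance observed, D_min the smallest inter-class distance (our reading
# of the D_min < d_max condition).
d_max = max(np.flatnonzero(h1).max(), np.flatnonzero(h2).max())
D_min = min(np.count_nonzero(a != b) for a in class1 for b in class2)
print(f"d_max = {d_max}, D_min = {D_min}, overlap: {D_min < d_max}")
```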
For each distribution, various values of n are taken. For each value of n, 2000 patterns are generated per class. Of these, 1000 patterns from each class are used to build the classification model, and the remaining 1000 are used to test the prediction accuracy of the model. For each value of n, 10 different pairs of pattern sets are built.
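A minimal sketch of this experimental setup is given below. The function make_pattern_pair and the particular generation scheme (noisy copies of two random n-bit centres) are assumptions introduced for illustration; only the counts (2000 patterns per class, a 1000/1000 train-test split, 10 pairs per value of n) come from the text.

```python
import numpy as np

def make_pattern_pair(n, patterns_per_class=2000, seed=0):
    """Build one pair of n-bit pattern sets and split each class 50/50 into
    training and test data, mirroring the counts quoted above. The generation
    scheme is an assumption, not the authors' exact procedure."""
    rng = np.random.default_rng(seed)
    centres = rng.integers(0, 2, size=(2, n), dtype=np.uint8)
    split = patterns_per_class // 2
    sets = []
    for c in centres:
        noise = (rng.random((patterns_per_class, n)) < 0.1).astype(np.uint8)
        cls = c ^ noise                          # flip roughly 10% of the bits
        sets.append((cls[:split], cls[split:]))  # (training, test)
    return sets                                  # [(train_I, test_I), (train_II, test_II)]

# 10 different pairs of pattern sets for a given n, as in the experimental setup
pairs = [make_pattern_pair(n=30, seed=s) for s in range(10)]
```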
Table 1 reports the classification efficiency for the data sets a−a, b−b, and c−c. Column II lists the different values of m (the number of attractor basins) for which the GA finds the best possible solution. Columns III to VI report the classification efficiency on the training and test data sets respectively. The classification efficiency on the training set is the percentage of patterns that can be classified into distinct attractors, while that on the test set is the percentage of patterns that can be correctly predicted. The best classification efficiency corresponding to each m in the final generation is taken, averaged over the 10 different pairs of pattern sets built for each value of n.
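A sketch of the two efficiency measures defined above follows, assuming integer class labels and that the mapping from a pattern to its attractor basin is available as a separate (hypothetical) maca_basin function; the majority-vote assignment of basins to classes is our assumption about how the attractor/class association is fixed.

```python
import numpy as np

def training_efficiency(basins, labels):
    """Percentage of training patterns mapped to an attractor basin dominated
    by their own class (basins, labels: integer numpy arrays of equal length).
    The majority-vote basin-to-class assignment is an assumption."""
    basin_class = {b: np.bincount(labels[basins == b]).argmax()
                   for b in np.unique(basins)}
    hits = sum(basin_class[b] == y for b, y in zip(basins, labels))
    return 100.0 * hits / len(labels)

def test_efficiency(predicted_labels, true_labels):
    """Percentage of test patterns whose class is correctly predicted."""
    return 100.0 * np.mean(np.asarray(predicted_labels) == np.asarray(true_labels))

# Usage sketch: `maca_basin` stands for a hypothetical function mapping an
# n-bit pattern to the attractor-basin index of the evolved MACA.
# basins = np.array([maca_basin(p) for p in train_patterns])
# print(training_efficiency(basins, train_labels))
```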
The following experiments validate the theoretical foundations of the classi-
fier performance reported in earlier sections.
5.1 Expt 1: Study of GA Evolution
The GA starts with various values of m, but the population soon becomes concentrated in a certain zone of values. The genetic algorithm is allowed to evolve for 50 generations. In each case, 80% of the population in the final solution assumes the two or
 