The graphs in Figs. 7-9 plot the expected occurrence EO(r, m), denoted by relation 4, on the y-axis, while the weight of patterns is plotted on the x-axis as a fraction of n, the number of bits in a pattern.
In Fig. 7, where m = 1, it can be seen that the expected occurrence does not follow a monotonically decreasing function but reaches its peak at a slightly higher weight value. However, as the value of m is increased (Fig. 8, m = 2), the function becomes monotonically decreasing. The gradient becomes steeper as the value of m is increased further (Fig. 9).
The graph of Fig. 9 is plotted for different values of m, keeping n (= 30) constant. It can be seen that the expectation of lower-weight patterns occurring in the zero basin increases manifold.
5 Performance Analysis of MACA Based Classifier
For convenience of performance analysis, the distributions of patterns in the two classes are assumed to be as shown in Fig. 10. Each pair of sets on which the classifiers are run is characterized by one of the curves (a-a, b-b, c-c, d-d). The ordinate of a curve represents the number of pairs of patterns at the specified hamming distance. For example, point A (on the curve a) indicates y pairs of patterns at hamming distance x. The abscissa is plotted in both directions: from left to right for Class I and from right to left for Class II.
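The pairwise hamming-distance distribution underlying these curves can be sketched as follows. This is a minimal illustration on a hypothetical toy pattern set, not the actual data of Fig. 10:

```python
from itertools import combinations
from collections import Counter

def hamming(p, q):
    """Number of bit positions in which two equal-length patterns differ."""
    return sum(a != b for a, b in zip(p, q))

def distance_histogram(patterns):
    """Map each hamming distance x to the number y of pattern pairs at that
    distance -- the quantity plotted as the ordinate of the curves in Fig. 10."""
    return Counter(hamming(p, q) for p, q in combinations(patterns, 2))

# Hypothetical 4-bit pattern set standing in for one class
class_I = ["0000", "0001", "0011"]
hist = distance_histogram(class_I)
# hist[x] gives the number of pairs at hamming distance x
```

Plotting `hist` for each class (one left-to-right, the other right-to-left) reproduces the style of curves described above.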
The curves of Class I and Class II overlap if D_min < d_max. An ideal distribution a-a is represented by the continuous line, without any overlap of the two classes.
In each distribution, various values of n are taken. For each value of n, 2000 patterns are taken per class. Of these, 1000 patterns from each class are used to build the classification model; the remaining 1000 are used to test the prediction accuracy of the model. For each value of n, 10 different pairs of pattern sets are built.
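The train/test split described above can be sketched as follows. The function and the toy 20-pattern set are hypothetical; the text uses 1000/1000 out of 2000 patterns per class:

```python
import random

def split_class(patterns, train_size):
    """Split one class's pattern set into a training half (used to build the
    classification model) and a test half (used to measure prediction accuracy)."""
    shuffled = patterns[:]      # copy so the caller's list is untouched
    random.shuffle(shuffled)
    return shuffled[:train_size], shuffled[train_size:]

# Hypothetical toy class of 20 five-bit patterns, split 10/10
pats = [format(i, "05b") for i in range(20)]
train, test = split_class(pats, train_size=10)
```

Repeating this for 10 independent pairs of pattern sets per value of n yields the averaged results reported later.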
Table 1 reports the classification efficiency for the data sets a-a, b-b, and c-c. Column II represents the different values of m (number of attractor basins) for which the GA finds the best possible solution. Columns III to VI represent the classification efficiency of the training and test data sets respectively. The classification efficiency of the training set is the percentage of patterns which can be classified into different attractors, while that of the test data is the percentage of data which can be correctly predicted. The best classification efficiency corresponding to each m in the final generation is taken, averaged over the 10 different pairs of pattern sets built for each value of n.
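The test-set efficiency measure can be sketched as a simple percentage of correct predictions. The attractor-basin labels below are hypothetical:

```python
def efficiency(predicted, actual):
    """Classification efficiency: percentage of patterns whose predicted class
    (attractor-basin label) matches the actual class."""
    assert len(predicted) == len(actual)
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(predicted)

# Hypothetical labels for 5 test patterns: 4 of 5 match
print(efficiency([1, 1, 2, 2, 1], [1, 2, 2, 2, 1]))  # -> 80.0
```

The training-set efficiency is computed the same way, with "predicted" taken as the basin into which the MACA maps each training pattern.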
The following experiments validate the theoretical foundations of the classifier performance reported in earlier sections.
5.1 Expt 1: Study of GA Evolution
The GA starts with various values of m, but the population soon becomes concentrated in a certain zone of values. The genetic algorithm is allowed to evolve for 50 generations. In each case, 80% of the population in the final solution assumes the two or