3 Gene Libraries for Coverage
The most naïve way of looking at antibody creation is as a way of covering a
multidimensional area (antigen space). This is somewhat complicated by the necessity
of avoiding self. Do evolved gene libraries improve such coverage? What about the
effect of different numbers of gene libraries? In order to answer these questions, we
evolved a number of different library configurations (see Table 1) and tested them
using 8-bit r-contiguous matching on antibodies/antigens of 32 bits.
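The matching rule can be sketched as follows. This is a minimal illustration, assuming the usual definition of r-contiguous matching (two bit strings match if they agree in at least r adjacent positions) and assuming that the 32-bit strings are stored as Python integers. The function name and its parameters are our own and do not come from the original experiments.

```python
def r_contiguous_match(a: int, b: int, r: int = 8, length: int = 32) -> bool:
    """Return True if a and b agree in at least r contiguous bit positions.

    Assumption: antibodies and antigens are `length`-bit unsigned integers.
    """
    mask = (1 << length) - 1
    agree = ~(a ^ b) & mask  # bit i is 1 where a and b have the same bit
    run = 0
    for i in range(length):
        if (agree >> i) & 1:
            run += 1
            if run >= r:
                return True
        else:
            run = 0  # a disagreement breaks the current run
    return False
```

With r = 8, for example, an all-zero string and 0x80808080 agree only in runs of seven bits, so they do not match.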
Table 1. Configuration of gene libraries. We kept the number of antibodies and their size
(almost) constant in each case. Each row shows how we created these antibodies using a
combination of gene library segments, and how we changed the segment size and number of
genes per library in each case. Genome size is calculated as (number of segments × size of
each segment), summed over all libraries.
Number of   Segments in    Size of each   Number of    Genome
libraries   each library   segment        antibodies   size
1           1089           32             1089         34848
2           33,33          16,16          1089         1056
3           11,11,9        11,10,11       1089         321
4           6,6,6,5        8,8,8,8        1080         184
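The arithmetic behind the table, and one plausible way of expressing antibodies from a genome, can be sketched as below. We assume that the genome is a flat list of bits laid out library by library, and that an antibody is formed by concatenating one segment from each library in order. The text does not spell out either detail, so treat the layout and the helper names (`library_stats`, `decode_genome`, `express_antibodies`) as illustrative assumptions.

```python
from itertools import product
from math import prod


def library_stats(segments, sizes):
    """Return (number of antibodies, genome size) for one configuration."""
    n_antibodies = prod(segments)  # one segment chosen from each library
    genome_size = sum(n * s for n, s in zip(segments, sizes))
    return n_antibodies, genome_size


def decode_genome(genome, segments, sizes):
    """Split a flat bit list into libraries of segments (assumed layout)."""
    libraries, pos = [], 0
    for n, s in zip(segments, sizes):
        library = [tuple(genome[pos + i * s: pos + (i + 1) * s]) for i in range(n)]
        libraries.append(library)
        pos += n * s
    return libraries


def express_antibodies(libraries):
    """Yield every antibody obtainable by joining one segment per library."""
    for combo in product(*libraries):
        yield tuple(bit for segment in combo for bit in segment)
```

For the 4-library row, `library_stats([6, 6, 6, 5], [8, 8, 8, 8])` gives 1080 antibodies and a genome of 184 bits, and each expressed antibody is 32 bits long.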
For each of these different configurations a generational Genetic Algorithm (GA)
was run for 2000 generations. The GA had a population of size 128 and used binary
tournaments to select parents, one-point crossover with probability 0.7, and mutation
with a bit-wise probability of 1/genome_len. To assess the effect of random creation
in libraries, we ran a parallel set of experiments with the bit-wise mutation probability
set to 50%. When performed with 1 library, this is equivalent to classical random
creation without libraries.
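A generational GA with these settings might look like the sketch below. The text fixes the population size, selection scheme, crossover rate and mutation rate. The uniformly random initial population, the absence of elitism, the tie-breaking in the tournament and the `fitness` callback are our assumptions. Passing `p_mut=0.5` gives the random-creation variant, since flipping each bit with probability 0.5 yields a uniformly random genome.

```python
import random


def tournament(pop, fits, rng):
    """Binary tournament: pick two individuals at random, keep the fitter."""
    i, j = rng.randrange(len(pop)), rng.randrange(len(pop))
    return pop[i] if fits[i] >= fits[j] else pop[j]


def one_point_crossover(p1, p2, rng):
    """Swap the tails of two parents at a random cut point."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def mutate(genome, p_mut, rng):
    """Flip each bit independently with probability p_mut."""
    return [b ^ 1 if rng.random() < p_mut else b for b in genome]


def run_ga(fitness, genome_len, pop_size=128, generations=2000,
           p_cross=0.7, p_mut=None, seed=0):
    """Return the final population of a simple generational GA."""
    rng = random.Random(seed)
    if p_mut is None:
        p_mut = 1.0 / genome_len  # bit-wise rate stated in the text
    # Assumption: the initial population is uniformly random.
    pop = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        fits = [fitness(g) for g in pop]
        nxt = []
        while len(nxt) < pop_size:
            a, b = tournament(pop, fits, rng), tournament(pop, fits, rng)
            if rng.random() < p_cross:
                a, b = one_point_crossover(a, b, rng)
            nxt.append(mutate(a, p_mut, rng))
            nxt.append(mutate(b, p_mut, rng))
        pop = nxt[:pop_size]  # the whole population is replaced each generation
    return pop
```

In the experiments described here, `fitness` would be the coverage of the non-self set achieved by the antibodies that a genome encodes.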
Twenty-five self sets of 128 proteins were created, each with a corresponding
non-self set of 1024 antigens, none of which exactly matched any of the self proteins.
These were used as the basis for the 25 runs of each algorithm. Individuals were
assessed by creating all of the possible antibodies they encoded (1080 or 1089 as
appropriate) and then removing those which were an 8-bit r-contiguous match to any
member of the self set. The remaining antibodies (“detectors”) were used to assess the
coverage of the non-self set.
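The evaluation step can be sketched as follows. We assume that strings are stored as 32-bit integers, that the self and non-self sets are drawn uniformly at random without duplicates, and that coverage is the fraction of non-self antigens matched by at least one detector. The text does not state these details, so they are illustrative choices, as are the function names.

```python
import random


def r_contiguous_match(a, b, r=8, length=32):
    """True if a and b agree in at least r contiguous bit positions."""
    agree = ~(a ^ b) & ((1 << length) - 1)
    run = 0
    for i in range(length):
        run = run + 1 if (agree >> i) & 1 else 0
        if run >= r:
            return True
    return False


def make_self_nonself(n_self=128, n_nonself=1024, length=32, seed=0):
    """Draw random self and non-self sets with no exact overlap (assumed)."""
    rng = random.Random(seed)
    self_set = set()
    while len(self_set) < n_self:
        self_set.add(rng.getrandbits(length))
    nonself = set()
    while len(nonself) < n_nonself:
        x = rng.getrandbits(length)
        if x not in self_set:  # no antigen exactly matches a self protein
            nonself.add(x)
    return sorted(self_set), sorted(nonself)


def coverage(antibodies, self_set, nonself, r=8, length=32):
    """Return (fraction of non-self covered, number of detectors)."""
    # Negative selection: discard antibodies that match any self protein.
    detectors = [ab for ab in antibodies
                 if not any(r_contiguous_match(ab, s, r, length) for s in self_set)]
    covered = sum(any(r_contiguous_match(d, ag, r, length) for d in detectors)
                  for ag in nonself)
    return covered / len(nonself), len(detectors)
```

An individual whose antibodies all match self ends up with no detectors and therefore zero coverage.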
Figure 1 shows the coverage attained by the best-performing individual over 2000
generations (x-axis), averaged over twenty-five runs. This illustrates how the use of
evolving gene libraries comprehensively outperforms random creation on this basic
task. For coverage averaged over the last 500 generations, ANOVA followed by post-hoc
testing using Tamhane's T2 test (which does not assume equal variances) revealed that
2 libraries performed best (98.14%), followed by 1 library (97.80%), 3 libraries
(76.20%) and 4 libraries (56.97%). All differences are significant at the 95%
confidence level.
Very similar results can be seen if we compare the average population coverage,
although, interestingly, in this case the use of 1 library gave the best result (97.8%),
compared to 97.0% coverage for 2 libraries; this difference is again statistically
significant.