resulted in invalid structures. Indeed, only a very limited number of modifica-
tions can be made on GP parse trees in order to guarantee the creation of
valid structures. The problem with this kind of system is that extremely effi-
cient search operators such as point mutation cannot be used. Instead, an
inefficient sub-tree swapping is used so that valid parse trees are always
produced. Nevertheless, no matter how carefully genetic operators are im-
plemented, there are obviously limits to what grafting and pruning can do,
and the search space in such systems can never be thoroughly explored.
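The constraint described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a parse tree is written as a flat prefix list and that every symbol has a fixed arity; the symbol set and the names `ARITY`, `is_valid_prefix` and `point_mutate` are hypothetical and not taken from any GP system. Replacing one symbol with another of a different arity leaves a structure that no longer encodes a single complete tree.

```python
# Arity of each symbol: functions take 2 arguments, terminals take 0.
# The symbol set is a hypothetical example, not taken from the text.
ARITY = {"+": 2, "*": 2, "a": 0, "b": 0, "c": 0}

def is_valid_prefix(tokens):
    """Return True if tokens encode exactly one complete expression tree."""
    needed = 1  # number of subtrees still waiting to be filled
    for tok in tokens:
        if needed == 0:
            return False  # symbols left over after the tree is complete
        needed += ARITY[tok] - 1
    return needed == 0

def point_mutate(tokens, index, new_symbol):
    """Replace a single symbol, leaving everything else untouched."""
    mutated = list(tokens)
    mutated[index] = new_symbol
    return mutated

parent = ["+", "a", "*", "b", "c"]              # a + (b * c)
same_arity = point_mutate(parent, 2, "+")       # function -> function
changed_arity = point_mutate(parent, 2, "a")    # function -> terminal

print(is_valid_prefix(parent))         # True
print(is_valid_prefix(same_arity))     # True
print(is_valid_prefix(changed_arity))  # False
```

In this sketch only arity-preserving replacements keep the tree valid, which is one way to see why an unrestricted point mutation is not safe on such structures.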
1.1.3 Proteins
Proteins are long, linear strings of 20 different amino acids, and they constitute
the immediate expression of the genetic information stored in DNA. This
means that the four-letter language of DNA is translated into the more com-
plex 20-letter language of proteins. Obviously, there must be some kind of
code (genetic code) to translate the language of the four nucleotides into the
language of 20 amino acids. In order to specify each of the 20 amino acids
there should be at least 20 DNA “words”. By using triplets of nucleotides
(codons) for each amino acid, 4^3 = 64 different three-letter “words” are possible.
This is more than adequate to code for the 20 amino acids and, in fact,
most amino acids have multiple codons: only three of the 64 codons code
for the instruction “stop synthesis”, leaving 61 codons for the 20 amino acids.
There is also a codon for a “start synthesis” instruction, but this codon
also codes for methionine, one of the 20 amino
acids found in proteins. The genetic code is virtually universal, meaning that
all organisms on Earth with very few exceptions use the same codons to
translate the language of their genes into proteins (the genetic code is shown
in section 1.2.4, Figure 1.6).
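The counting argument above can be checked with a short sketch. It only does arithmetic on the numbers given in the text (four nucleotides, 20 amino acids, three stop codons); the helper name `min_word_length` is an illustrative assumption.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
N_AMINO_ACIDS = 20
N_STOP_CODONS = 3

def min_word_length(alphabet_size, n_meanings):
    """Smallest word length whose number of words covers n_meanings."""
    length = 1
    while alphabet_size ** length < n_meanings:
        length += 1
    return length

# Words of length 1 and 2 give only 4 and 16 combinations, too few for 20.
for n in (1, 2, 3):
    print(n, len(NUCLEOTIDES) ** n)

# Enumerate every three-letter word explicitly.
codons = ["".join(p) for p in product(NUCLEOTIDES, repeat=3)]
sense_codons = len(codons) - N_STOP_CODONS

print(min_word_length(len(NUCLEOTIDES), N_AMINO_ACIDS))  # 3
print(len(codons), sense_codons)                         # 64 61
```

With 61 codons shared among 20 amino acids, the average is about three codons per amino acid, which is consistent with most amino acids having more than one codon.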
Thus, the information for proteins is decoded one triplet at a time
and expressed as linear sequences of amino acids. Although the amino acid
sequence of the protein reflects the sequence of the corresponding DNA
molecule, the protein has a unique three-dimensional structure and exhibits
unique properties. Because of the richer chemical alphabet of proteins, the
linear strings of amino acids fold in special ways giving each protein its
individual three-dimensional structure. This unique three-dimensional struc-
ture or tertiary organization of proteins, together with the vast chemical rep-
ertoire of amino acids, allows proteins to play numerous roles, amongst them
the role of biological catalysts or enzymes. In fact, proteins are the real workers
of the cell.
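The triplet-by-triplet decoding described at the start of this section can be sketched as follows. This is a simplified illustration, not a model of the cellular machinery: it reads the DNA letters directly (skipping the messenger RNA intermediate), uses the standard genetic code written in the compact one-letter amino acid notation, and the names `CODON_TABLE` and `translate` are assumptions made for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons in TCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(dna):
    """Read codons from the first ATG until a stop codon or the end."""
    start = dna.find("ATG")  # the start codon also codes for methionine (M)
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # "stop synthesis"
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("ATGGCCTGGTAA"))    # MAW
print(translate("CCATGGCCTGGTAA"))  # MAW
```

The output is only the linear amino acid sequence; the folding into a three-dimensional structure discussed above is not captured by this kind of lookup.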