Biomedical Engineering Reference
In-Depth Information
Table 1. The substitution matrix describing mutational pressure in the leading DNA strand. A
nucleotide in the first column is substituted by a nucleotide in the first row.
     A     T     G     C
A  0.81  0.10  0.07  0.02
T  0.07  0.87  0.03  0.03
G  0.16  0.12  0.71  0.01
C  0.07  0.26  0.05  0.62
Table 2. The substitution matrix describing mutational pressure in the lagging DNA strand. A
nucleotide in the first column is substituted by a nucleotide in the first row.
     A     T     G     C
A  0.87  0.07  0.03  0.03
T  0.10  0.81  0.02  0.07
G  0.26  0.07  0.62  0.05
C  0.12  0.16  0.01  0.71
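As an illustration of how the two tables can be used, the sketch below stores them as Python dictionaries and draws a substituted nucleotide from the row of the original one. It assumes that each row can be read as a probability distribution over the resulting nucleotide (the rows of both tables sum to 1). The names LEADING, LAGGING, complement_matrix, and substitute are hypothetical and are not taken from the source. The helper complement_matrix also shows that the values of Table 2 equal those of Table 1 once both the original and the resulting nucleotide are replaced by their complements.

```python
import random

NUCLEOTIDES = "ATGC"  # column order used in Tables 1 and 2

# Rows: original nucleotide. Columns (A, T, G, C): nucleotide after substitution.
LEADING = {
    "A": [0.81, 0.10, 0.07, 0.02],
    "T": [0.07, 0.87, 0.03, 0.03],
    "G": [0.16, 0.12, 0.71, 0.01],
    "C": [0.07, 0.26, 0.05, 0.62],
}
LAGGING = {
    "A": [0.87, 0.07, 0.03, 0.03],
    "T": [0.10, 0.81, 0.02, 0.07],
    "G": [0.26, 0.07, 0.62, 0.05],
    "C": [0.12, 0.16, 0.01, 0.71],
}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_matrix(matrix):
    """Return the matrix with source and target nucleotides complemented."""
    result = {}
    for src in NUCLEOTIDES:
        row = matrix[COMPLEMENT[src]]
        result[src] = [row[NUCLEOTIDES.index(COMPLEMENT[dst])]
                       for dst in NUCLEOTIDES]
    return result


def substitute(nucleotide, matrix, rng=random):
    """Draw the nucleotide replacing `nucleotide`.

    Assumption: the matrix row is used directly as sampling weights.
    """
    return rng.choices(NUCLEOTIDES, weights=matrix[nucleotide], k=1)[0]
```

For example, substitute("C", LEADING) returns one of A, T, G, or C, with C itself being the most likely outcome under this reading of Table 1.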
with their potential pseudogenes found in intergenic regions of the B. burgdorferi
chromosome [3].
The mutated gene sequence was checked against two selection criteria:
1. the appearance of a stop codon inside the gene sequence,
2. the strength of its coding signal.
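The first criterion can be sketched as a scan over in-frame codons. This is a minimal illustration, not the authors' code: the function name has_internal_stop is hypothetical, and the sketch assumes that the gene is given in its own reading frame, that its length is a multiple of three, and that its last codon is the gene's own terminator, which is therefore skipped.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # the three standard stop codons


def has_internal_stop(gene):
    """Return True if an in-frame stop codon occurs before the last codon."""
    # Step through codons 0, 3, 6, ... and leave out the final codon,
    # which is assumed to be the gene's regular terminator.
    for i in range(0, len(gene) - 3, 3):
        if gene[i:i + 3] in STOP_CODONS:
            return True
    return False
```

Out-of-frame occurrences are ignored: in ATGATTTAA the triplet TGA spans two codons, so the function returns False for it.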
We considered the three standard stop codons (TAA, TAG, and TGA). The coding
signal was calculated with a gene-finding algorithm (called PMC) based on
the theory of Markov chains [29, 30, 33]. This algorithm recognizes protein coding
sequences in prokaryotic genomes very efficiently. It uses three independent
homogeneous Markov chains to describe the occurrence of nucleotides at each of the
three codon positions in a given DNA sequence separately. The method is based on
specific correlations in the nucleotide composition observed at the first, second, and
third codon positions of protein coding genes [34, 35]. In addition, the algorithm does
not require large learning sets, so the nucleotide transition matrices it uses can be
built from only a few coding sequences [30, 33]. The individual carrying the mutated
gene was eliminated if at least one stop codon was generated inside its sequence or if
the sequence was not recognized by the PMC algorithm as a protein coding sequence in
the first reading frame. The eliminated individual was replaced by another one chosen
randomly from the population.
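The idea of describing each codon position with its own Markov chain can be sketched as follows. This is not the PMC algorithm itself, whose details are given in [29, 30, 33]. It is a simplified illustration under stated assumptions: each chain is taken to be a first-order chain over the nucleotides found at one codon position of successive codons, transition matrices are estimated with a pseudocount, and a sequence counts as coding in the first frame when that frame has the highest mean log-likelihood. All function names and the decision rule are hypothetical.

```python
import math

NUCLEOTIDES = "ATGC"


def train_chains(coding_sequences, pseudocount=1.0):
    """Estimate one 4x4 transition matrix for each of the 3 codon positions."""
    chains = []
    for pos in range(3):
        counts = {a: {b: pseudocount for b in NUCLEOTIDES} for a in NUCLEOTIDES}
        for seq in coding_sequences:
            sub = seq[pos::3]  # nucleotides at this codon position only
            for prev, cur in zip(sub, sub[1:]):
                counts[prev][cur] += 1
        chains.append({
            a: {b: counts[a][b] / sum(counts[a].values()) for b in NUCLEOTIDES}
            for a in NUCLEOTIDES
        })
    return chains


def frame_score(seq, chains, frame):
    """Mean log-likelihood per transition when reading `seq` from `frame`."""
    shifted = seq[frame:]
    total, n = 0.0, 0
    for pos in range(3):
        sub = shifted[pos::3]
        for prev, cur in zip(sub, sub[1:]):
            total += math.log(chains[pos][prev][cur])
            n += 1
    return total / n if n else float("-inf")


def coding_in_first_frame(seq, chains):
    """Assumed decision rule: frame 0 must score strictly best."""
    scores = [frame_score(seq, chains, f) for f in range(3)]
    return scores[0] > max(scores[1], scores[2])
```

Averaging per transition keeps the three frames comparable, because the shifted readings contain slightly fewer transitions than the unshifted one.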
In our simulations, we took into account three different versions of the mutational
pressure acting on gene sequences. In the first variant (direct pressure), the
genes from a given DNA strand (e.g. leading) were subjected to the matrix of the strand
on which they were located (i.e. leading). In the second variant (reverse pressure), the
genes were under the pressure characteristic of the opposite strand. Apart
from these two constant pressures, we also applied a changing pressure. In this case
the genes were subjected to the leading and lagging strand pressures, which were switched
every 0.5 million MCS. Such a simulation mimics the inversion of a gene in the chromosome,
in which the gene is translocated from one DNA strand to the other. We have
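The three pressure variants described above can be summarized as a small scheduling function that returns which strand's matrix acts on a gene at a given simulation step. This is only a sketch: the function and mode names are hypothetical, and it assumes that the changing variant starts with the pressure of the gene's own strand, which the text does not state.

```python
SWITCH_INTERVAL = 500_000  # MCS between switches (0.5 million, from the text)


def pressure_for_step(step, home_strand, mode):
    """Return 'leading' or 'lagging': the strand whose matrix applies at `step`.

    home_strand: strand on which the gene lies ('leading' or 'lagging').
    mode: 'direct', 'reverse', or 'changing'.
    """
    other = "lagging" if home_strand == "leading" else "leading"
    if mode == "direct":
        return home_strand
    if mode == "reverse":
        return other
    if mode == "changing":
        # Assumption: even-numbered intervals use the home strand's pressure,
        # odd-numbered intervals use the opposite strand's pressure.
        interval = step // SWITCH_INTERVAL
        return home_strand if interval % 2 == 0 else other
    raise ValueError(f"unknown mode: {mode!r}")
```

With this schedule, a gene from the leading strand under the changing variant experiences the leading strand matrix for steps 0 to 499,999, the lagging strand matrix for the next 0.5 million steps, and so on.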