This implies that operation 2 can be used to rearrange sequences in a DNA
molecule, thus accomplishing gene unscrambling.
The above operations are similar to the “splicing operation” introduced
by Head [3] and circular splicing and mixed splicing [4, 14-16, 21]. It was
subsequently shown that some of these models have the computational power
of a universal Turing machine [1, 13, 22]. (See Head et al. [5] for a review.)
The process of gene unscrambling entails a series of successive or possibly
simultaneous intra- and intermolecular homologous recombinations. This is
followed by excision of all sequences $\tau_s y \tau_e$, where the sequence $y$ is
marked by the presence of telomere addition sequences: $\tau_s$ for telomere
“start” (at its 5' end) and $\tau_e$ for telomere “end” (at its 3' end). Thus,
from a long sequence $u \tau_s y \tau_e v$, this step retains only
$\tau_s y \tau_e$ in the macronucleus. Last, the enzyme telomerase extends the
length of the telomeric sequences (usually double-stranded TTTTGGGG repeats
in these organisms) from $\tau_s$ and $\tau_e$ to protect the ends of the DNA
molecule.
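The excision step can be illustrated with a minimal sketch. It assumes a DNA molecule is modeled as a list of symbolic tokens, with the strings "ts" and "te" standing in for the telomere addition sequences; the function name `excise_all` and this token encoding are hypothetical choices made for illustration only.

```python
def excise_all(tokens, start="ts", end="te"):
    """Return every stretch start ... end found in tokens, markers included.

    Everything outside such a stretch (the u and v of the text) is dropped.
    """
    kept = []
    pos = 0
    while True:
        try:
            s = tokens.index(start, pos)    # next telomere "start" marker
            e = tokens.index(end, s + 1)    # first "end" marker after it
        except ValueError:
            break                           # no further complete stretch
        kept.append(tokens[s:e + 1])
        pos = e + 1
    return kept


# A long sequence u ts y te v retains only ts y te.
print(excise_all(["u", "ts", "y", "te", "v"]))
```

Telomere extension by telomerase is not modeled here; in this encoding it would amount to lengthening the two marker tokens.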
We now make the assumption that, either by a structural alignment of the
DNA or by other biochemical factors, the cell decides which sequences are
non-protein-coding (IESs) and which are ultimately protein coding (MDSs),
as well as which are the pointers $x$. Such biological shortcuts are presumably
essential to bring into proximity the pointers $x$. Each of the $n$ MDSs,
denoted primarily by $\alpha_i$, $1 \le i \le n$, is flanked by the pointers
$x_{i-1,i}$ and $x_{i,i+1}$. Each pointer points to the MDS that should precede
or follow $\alpha_i$ in the final sequence.
The only exceptions are $\alpha_1$, which is preceded by $\tau_s$, and
$\alpha_n$, which is followed by $\tau_e$ in the input string or micronuclear
molecule. Note that, although generally present once in the final
macronuclear copy, each $x_{i,i+1}$ occurs at least twice in the micronuclear
copy: once after $\alpha_i$ and once before $\alpha_{i+1}$.
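The flanking rule can be checked with a small sketch. It assumes tokens such as "a2" for an MDS, "x2,3" for a pointer, and "ts"/"te" for the telomere markers, and it separates consecutive flanked MDSs with placeholder spacer tokens "e1", "e2", and so on. The function names, the token encoding, and the scrambled order used in the demonstration are all hypothetical.

```python
from collections import Counter


def flanked_mds(i, n):
    """Return MDS i of n together with its left and right flank."""
    left = "ts" if i == 1 else f"x{i - 1},{i}"     # MDS 1 is preceded by ts
    right = "te" if i == n else f"x{i},{i + 1}"    # MDS n is followed by te
    return [left, f"a{i}", right]


def micronuclear(order, n):
    """Concatenate the flanked MDSs in the given scrambled order."""
    tokens = []
    for k, i in enumerate(order, start=1):
        if k > 1:
            tokens.append(f"e{k - 1}")             # placeholder spacer (assumption)
        tokens.extend(flanked_mds(i, n))
    return tokens


def pointer_counts(tokens):
    """Count how often each pointer token occurs."""
    return dict(Counter(t for t in tokens if t.startswith("x")))


tokens = micronuclear([2, 4, 1, 3], n=4)
print(tokens)
print(pointer_counts(tokens))
```

In this encoding every pointer appears exactly twice, once after its left MDS and once before its right MDS, which is the minimum the text allows.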
We denote by $\varepsilon_k$ an internal sequence that is eliminated;
$\varepsilon_k$ does not occur in the final sequence. Thus, since unscrambling
leaves one copy of each $x_{i,i+1}$ between $\alpha_i$ and $\alpha_{i+1}$, an
IES is nondeterministically either $\varepsilon_k x_{i,i+1}$ or
$x_{i-1,i} \varepsilon_k$, depending on which pointer $x_{i,i+1}$ is eliminated.
Similarly, an MDS is technically either $\alpha_i x_{i,i+1}$ or
$x_{i-1,i} \alpha_i$. For this model, either choice is equivalent.
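The pointer-guided recombinations that drive unscrambling can be sketched on token lists. The sketch assumes that the intramolecular operation has the form u x w x v ⇒ u x v + •w x (a circular molecule •w x is excised) and that the intermolecular operation is its reverse, u x v + •w x ⇒ u x w x v. These forms, the function names, and the list encoding are assumptions made for illustration, not definitions taken from this passage.

```python
def intramolecular(linear, x):
    """u x w x v  ->  (u x v, circular w x), using the first two copies of x."""
    i = linear.index(x)
    j = linear.index(x, i + 1)
    u, w, v = linear[:i], linear[i + 1:j], linear[j + 1:]
    return u + [x] + v, w + [x]


def rotate_to_end(circular, x):
    """Re-read a circular molecule so that its listing ends with pointer x."""
    k = circular.index(x)
    return circular[k + 1:] + circular[:k + 1]


def intermolecular(linear, circular, x):
    """u x v + circular(w x)  ->  u x w x v, inserting at the first x in linear."""
    i = linear.index(x)
    w_x = rotate_to_end(circular, x)
    return linear[:i] + [x] + w_x + linear[i + 1:]


# Excise a circle at pointer "p", then reinsert it at the same pointer.
lin, circ = intramolecular(["u", "p", "w1", "w2", "p", "v"], "p")
print(lin, circ)
print(intermolecular(lin, circ, "p"))
```

A circular molecule has no distinguished starting point, so `rotate_to_end` only changes how the circle is written down, not the molecule itself.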
The following example (from Landweber and Kari [8]) models unscrambling
of a micronuclear gene that contains MDSs in the scrambled order 2-4-1-3 using
only the operation of linear/circular recombination:
{
ux
12
α
2
x
23
1
x
34
α
4
τ
e
2
τ
s
α
1
x
12
3
x
23
α
3
x
34
v
}⇒
{
ux
12
3
x
23
α
3
x
34
v,
•α
2
x
23
1
x
34
α
4
τ
e
2
τ
s
α
1
x
12
}
={
ux
12
3
x
23
α
3
x
34
v,
•
1
x
34
α
4
τ
e
2
τ
s
α
1
x
12
α
2
x
23
}⇒
{
ux
12
3
x
23
1
x
34
α
4
τ
e
2
τ
s
α
1
x
12
α
2
x
23
α
3
x
34
v
}⇒