Information Technology Reference
In-Depth Information
basepairs. With 30,000 genes estimated in each human genome, there are roughly
three billion basepairs. A challenging task in computational biology deals with the
finding of the ''best'' alignment amongst a given set of sequences. This is denoted
as multiple sequence alignment (MSA). The domain of MSA is not limited to
DNA sequences. It could be for proteins [1] and could become even more complex
in sequence splicing problems [2].
The alignment of two sequences of length L can be solved via dynamic
programming in O(L 2 ) time. Extending it to N sequences, it will take O(L N ) time,
and in some cases the MSA problem is NP-complete [3]. Recent work includes
CLUSTLW, T-COFFEE, etc. In one of the pioneering papers by Chris Lee, he
demonstrates the advantages of applying a graph theoretical approach to the
MSA problem (partial ordered alignment); and many graph-based techniques
have been proposed since then as explained in [3-5], and [6]. Lee proposed the
partial-ordered multiple-sequence alignment (PO-MSA) and demonstrated its
application in pair-wise alignment, as well as its application in progressive pair-
wise alignment [4, 5]. The results are comparable or even faster than some of the
best known algorithms.
In this chapter, we study the complexities of the process of forming a
PO-MSAG. Assuming there are O(N) sequences of length O(L) that are already
aligned, it will take at least O(NL) time to simplify the sequences sequentially into
a PO-MSAG. Here, we present a set of fast and parallel algorithms to generate a
PO-MSAG for any given set of aligned sequences at the hardware architecture
level. The hardware can then be fabricated as a low cost chip that can be used as a
coprocessor with a more powerful processor.
We will consider two reconfigurable mesh architectures on which to imple-
ment our solution: the nanoscale spin wave and the microscale electrical VLSI. We
separate the problem into two cases: data sequences with a fixed amount of data
variations and those with an arbitrary amount up to O(N). In the fixed variations
case, the spin wave and the VLSI architectures can construct the PO-MSAG in
O(1) time, as shown in [7, 8]. As for arbitrary O(N) variations, the spin-wave
architecture will construct the PO-MSAG in O(1) time whereas the VLSI one
takes O(N) time. An example of an input sequence set is shown in Figure 14.1.
A - C - G - T - T - A - C - T
A - G - T - T - G - G - G - C - T
(unsorted)
A - C - G - T - T - A - * - * - C - T
A - * - G - T - T - G - G - G - C - T
(sorted)
Figure 14.1. Example of a partial-order multiple-sequence alignment.
 
Search WWH ::




Custom Search