The Central Dogma - Bioinformatics Computing

Artificial neural networks and other forms of machine learning represent only one form of pattern matching that has application in bioinformatics. Other approaches to pattern matching include techniques such as genetic algorithms, which work by scoring candidate solutions against a fitness function that is used to select future generations. Hybrid systems of artificial neural networks supplemented by genetic algorithms, rule-based expert systems, and conventional, algorithmic programming hold particular promise in bioinformatics.
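
The selection idea behind genetic algorithms can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a method taken from the text: the target motif, the function names (fitness, mutate, crossover, evolve), and every parameter value are hypothetical choices made for the example. The fitness function simply counts matching positions, and the fittest half of each generation is used to breed the next.

```python
import random

ALPHABET = "ACGT"  # assumed alphabet: DNA bases


def fitness(candidate, target):
    """Count the positions at which candidate agrees with target."""
    return sum(a == b for a, b in zip(candidate, target))


def mutate(seq, rate, rng):
    """Replace each character with a random base with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else ch
                   for ch in seq)


def crossover(a, b, rng):
    """Single-point crossover: prefix of a joined to suffix of b."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(target, pop_size=50, generations=200, mutation_rate=0.05, seed=0):
    """Evolve random strings toward `target` (length must be at least 2).

    Returns the best string found and the best fitness per generation.
    """
    rng = random.Random(seed)
    population = ["".join(rng.choice(ALPHABET) for _ in target)
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Rank the current generation by the fitness function.
        population.sort(key=lambda s: fitness(s, target), reverse=True)
        history.append(fitness(population[0], target))
        if history[-1] == len(target):
            break  # perfect match found
        # Selection: only the fitter half may become parents.
        parents = population[: pop_size // 2]
        # Elitism: carry the current best forward unchanged.
        children = [population[0]]
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), mutation_rate, rng))
        population = children
    best = max(population, key=lambda s: fitness(s, target))
    return best, history


if __name__ == "__main__":
    best, history = evolve("GATTACAGATTACA")
    print(best, history[-1], len(history))
```

Because the best individual is always carried forward, the best fitness never decreases from one generation to the next. Real applications would replace the toy fitness function with a domain-specific score.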

In addition to simulating neural networks, the numerical processing capabilities of a computer are commonly used to simulate the interactions of various proteins and drugs at their active sites. Data-mining applications include searching the patterns of known gene structures for newly discovered patterns. Search engines are similarly instrumental in uncovering patterns or keywords in local or online databases. Pairwise sequence comparison, based on either BLAST or Smith-Waterman dynamic programming techniques, forms the basis for most sequence alignment operations. Using online tools based on BLAST, it's possible for anyone with a connection to the Internet to search for the best ways of aligning one sequence against another in a reasonable time, even though the number of possible alignments grows exponentially with the length of the two sequences. BLAST manages this with heuristics rather than by scoring every possible alignment.
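
To make the dynamic programming idea concrete, here is a small sketch of Smith-Waterman local alignment. It fills a table of len(a) x len(b) cells, so the work grows with the product of the sequence lengths rather than with the number of possible alignments. The scoring values (match=2, mismatch=-1, gap=-2), the linear gap penalty, and the function name are assumptions made for this example, not values from the text.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best local score, aligned part of a, aligned part of b).

    Uses a linear gap penalty: every gap character costs `gap`.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in b
            left = H[i][j - 1] + gap    # gap in a
            H[i][j] = max(0, diag, up, left)  # 0 restarts the alignment
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + step:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("TTACGTT", "GGACGGG"))  # (6, 'ACG', 'ACG')
```

BLAST-based tools take a different route: rather than filling the whole table, they use heuristics to concentrate on promising regions, which is what makes large database searches practical.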

Statistical analysis is an important component of searching and pattern matching, especially in dealing with uncertainty. For example, in the multiple sequence alignment problem in Figure 1-15, statistical methods can be used to determine the best alignment of the four polypeptide strings consistent with an alignment score that rewards perfect matches and penalizes imperfect matches as well as the number and length of the gaps introduced in the final sequence. Non-statistical methods of multiple sequence alignment can be used as well.
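
One possible form for such an alignment score is sketched below. It is an illustration only: the sum-of-pairs scheme, the affine gap convention (a gap of length k costs gap_open + (k - 1) * gap_extend, so both the number and the length of gaps are penalized), the parameter values, and the example strings are all assumptions, and the sequences are not those of Figure 1-15.

```python
from itertools import combinations


def pair_score(row_a, row_b, match=1, mismatch=-1, gap_open=-2, gap_extend=-1):
    """Score two rows of an alignment (equal length, '-' marks a gap)."""
    score = 0
    prev_gap = None  # which row carried a gap in the previous scored column
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            continue  # both rows gapped: column is ignored for this pair
        if x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            # Opening a gap costs more than extending one already open.
            score += gap_extend if prev_gap == side else gap_open
            prev_gap = side
        else:
            score += match if x == y else mismatch
            prev_gap = None
    return score


def sum_of_pairs(alignment, **scoring):
    """Total score of a multiple alignment: sum over all pairs of rows."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of an alignment must have equal length")
    return sum(pair_score(a, b, **scoring)
               for a, b in combinations(alignment, 2))


if __name__ == "__main__":
    # Hypothetical aligned strings, not taken from the figure.
    print(sum_of_pairs(["ACGT", "ACGT", "AC-T"]))  # 6
```

An alignment method, statistical or otherwise, would then search for the placement of gaps that maximizes a score of this kind.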

Figure 1-15. Multiple Sequence Alignment Problem. The unaligned polypeptide sequences are shown at the top of the figure, and the resultant sequence with gaps is shown at the bottom. Statistical and non-statistical methods can be used to identify the optimum position of gaps and the relative location of the high-scoring polypeptide sequences, as a starting point for evolutionary modeling, for example.
