Introduction - Protein Homology Detection Through Alignment of Markov Random Fields


Fig. 1.2 Illustration of sequence-sequence, sequence-profile and profile-profile comparison methods for homology detection and fold recognition. A node represents a protein, and the distance between two nodes represents their closeness. The large circles in red, blue and green indicate three different protein families with similar folds. In this figure, the two proteins marked with 1 belong to the same protein family, so their homologous relationship can be detected through sequence-sequence comparison. The two proteins marked with 2 are not in the same protein family, but they are still evolutionarily related, and their homologous relationship can be recognized by sequence-profile comparison. The two proteins marked with 3 are distantly related with similar folds, and their relationship may be recognized by profile-profile comparison.

Sequence-sequence, or pure sequence-based, methods detect homologs mainly by aligning two primary sequences. They work well for close homologs but not for remote homology detection. Existing sequence-based methods differ mainly in their alignment algorithms, amino acid mutation scores and gap penalties. Some methods, such as the Needleman-Wunsch [35] and Smith-Waterman [36] algorithms, employ dynamic programming to build alignments, while others, such as BLAST [37] and FASTA [38], use more efficient heuristic-based alignment algorithms. BLOSUM [39] and PAM [39] are two widely used amino acid substitution matrices for scoring the similarity of two aligned residues. An affine function is used to penalize gaps (i.e., unaligned residues) in an alignment.
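As a rough illustration of dynamic-programming alignment with an affine gap penalty, the following minimal Python sketch computes a global alignment score using three score tables. It is only a sketch under stated assumptions: the function names, the simple match/mismatch scores (standing in for a substitution matrix such as BLOSUM or PAM) and the default gap-open and gap-extend values are hypothetical choices for this example, and the penalty form open + (length - 1) * extend is one common convention rather than the one any particular tool necessarily uses.

```python
NEG_INF = float("-inf")


def affine_gap_penalty(length, gap_open=10, gap_extend=1):
    """Penalty for one gap of `length` residues (assumed convention):
    gap_open for the first residue, gap_extend for each further one."""
    if length <= 0:
        return 0
    return gap_open + (length - 1) * gap_extend


def global_alignment_score(a, b, match=2, mismatch=-1,
                           gap_open=10, gap_extend=1):
    """Best global alignment score of sequences a and b.

    Toy scoring (an assumption for illustration): `match` / `mismatch`
    replace a real substitution matrix lookup.
    """
    n, m = len(a), len(b)
    # M[i][j]: best score where a[i-1] is aligned to b[j-1]
    # X[i][j]: best score where a[i-1] is aligned to a gap
    # Y[i][j]: best score where b[j-1] is aligned to a gap
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0

    # Leading gaps: a prefix of one sequence aligned to nothing.
    for i in range(1, n + 1):
        X[i][0] = -affine_gap_penalty(i, gap_open, gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -affine_gap_penalty(j, gap_open, gap_extend)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Align the two residues after any previous state.
            M[i][j] = max(M[i - 1][j - 1],
                          X[i - 1][j - 1],
                          Y[i - 1][j - 1]) + s
            # Open a new gap or extend an existing one.
            X[i][j] = max(M[i - 1][j] - gap_open,
                          X[i - 1][j] - gap_extend,
                          Y[i - 1][j] - gap_open)
            Y[i][j] = max(M[i][j - 1] - gap_open,
                          Y[i][j - 1] - gap_extend,
                          X[i][j - 1] - gap_open)

    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "ACGT"))
    print(global_alignment_score("ACGT", "AT"))
```

Keeping separate tables for "residues aligned" and "ending in a gap" is what lets the recurrence charge a different cost for opening a gap than for extending one. With a single table, every gap residue would have to cost the same amount, which corresponds to a linear rather than an affine penalty.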

Alignment-based homology detection can be improved by using evolutionary information such as a PSI-BLAST sequence profile [37] or a profile Hidden Markov Model (HMM) [6]. A few methods have been developed to align one primary sequence to one sequence profile or align two sequence profiles. For example, HMMER [40] and SAM [41] are two tools that align one primary sequence to one
