Objective Functions - Multiple Sequence Alignment Methods


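$$\mathrm{SP}(S') = \sum_{1 \le i < j \le n} \mathrm{score}(R_i, R_j), \qquad \mathrm{score}(R_i, R_j) = \sum_{x=1}^{y} d(R_i[x], R_j[x]) = \sum_{x=1}^{y} d(S_{ix}, S_{jx})$$

Equivalently, summing column by column over the symbols $S'_{1,c}, \dots, S'_{n,c}$ of each column $c$:

$$\mathrm{SP}(S') = \sum_{c=1}^{y} \sum_{1 \le i < j \le n} d(S'_{i,c}, S'_{j,c})$$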

where $d : \Sigma \times \Sigma \to \mathbb{R}$ is the scoring function, $R_i[x]$ (also denoted $S_{ix}$) is the $x$th symbol of the $i$th row in the MSA, and $y$ is the row length. The scoring function $d$ can also be viewed as a matrix of predefined scores, in which each cell holds the score for aligning the two corresponding symbols.
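The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: the function names (`d`, `pair_score`, `sp_score`), the numeric scores, the gap symbol `-`, the choice to score a gap-gap pair as 0, and the toy rows are all assumptions made for the example.

```python
from itertools import combinations

GAP = "-"  # assumed gap symbol


def d(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Toy scoring function d(a, b); the numeric values are illustrative assumptions."""
    if a == GAP and b == GAP:
        return 0  # assumed convention: a gap-gap pair contributes nothing
    if a == GAP or b == GAP:
        return gap
    return match if a == b else mismatch


def pair_score(row_i: str, row_j: str) -> int:
    """Sum d over the aligned positions x = 1..y of two rows."""
    if len(row_i) != len(row_j):
        raise ValueError("rows of an alignment must have equal length")
    return sum(d(a, b) for a, b in zip(row_i, row_j))


def sp_score(rows: list[str]) -> int:
    """Sum-of-pairs score: add pair_score over every unordered pair of rows."""
    return sum(pair_score(r_i, r_j) for r_i, r_j in combinations(rows, 2))


# Hypothetical three-row alignment, not taken from the text.
rows = ["ACGT", "A-GT", "TCGT"]
print(sp_score(rows))  # 2
```

Replacing the body of `d` with a lookup in a table of predefined scores gives the matrix view of the scoring function without changing the rest of the sketch.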

The following example illustrates the SP method in detail:

Example: Given the following sequences:

- $S_1$: ACCCGA
- $S_2$: ACTA
- $S_3$: TCCTA

and their alignment $S'$:

$S'(S_1, S_2, S_3) =$

$S_1$: ACCCGA
$S_2$:
$S_3$: TCC

The SP score of this alignment is:

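$$\mathrm{SP}(S') = \sum_{c=1}^{6} \left[ d(S'_{1,c}, S'_{2,c}) + d(S'_{1,c}, S'_{3,c}) + d(S'_{2,c}, S'_{3,c}) \right]$$

that is, one bracketed group of pairwise terms for each of the six columns.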

In practice, mismatch and gap penalty scores are negative values, and a pairing of two gaps is ignored. In each step of the alignment, the SP method scores all pairs of residues in every column, which increases the complexity of the MSA algorithm by a factor of $O(n^2)$, where $n$ denotes the number of sequences.
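To make the quadratic factor concrete, the pair count can be written out as a short worked calculation. It assumes every pair in a column is evaluated (gap-gap pairs included) and that the example alignment has six columns, as the row of $S_1$ suggests:

$$\binom{n}{2} = \frac{n(n-1)}{2} \ \text{pairs per column}, \qquad y \cdot \frac{n(n-1)}{2} \ \text{evaluations of } d \text{ in total}$$

$$n = 3,\ y = 6: \quad 6 \cdot \frac{3 \cdot 2}{2} = 18$$

Doubling the number of sequences to $n = 6$ over the same six columns would already require $6 \cdot 15 = 90$ evaluations.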

In aligning DNA/RNA sequences, scoring schemes tend to be more egalitarian and largely independent of the particular symbols; protein sequence alignments, however, require more sophisticated approaches, because amino acids can be divided into various functional classes based on different similarity parameters. The two most popular score matrices for aligning protein sequences are the PAM and BLOSUM matrices [4]. The motivating idea in developing scoring matrices for
