Biomedical Engineering Reference
In-Depth Information
reported in Table 1.1, considering the sequence written horizontally, the last (the
rightmost) element would correspond to the symbol multiplying $21^0$, the last-but-one
element would correspond to the symbol multiplying $21^1$, and so on. Moreover,
the first symbol (Gly) in the list of possible components (Table 1.1) would mean
number 1, the second (Ala) number 2, and so on. An empty position (no amino acid)
would mean number 0. Only the empty position can mean 0: if an amino acid were
assigned 0, a sequence beginning with that amino acid would correspond to the same
number as the same sequence without the initial amino acid, and the correspondence
would not be one-to-one.
Example 1.5. The sequence Gly-Ser-Gly-Tyr, or, more precisely,
<no amino acid> <no amino acid> Gly Ser Gly Tyr
would then correspond to the number 0 ... 0 1 3 1 20 (or K) in base 21, which in base 10
is $20 \cdot 21^0\ (= 20) + 1 \cdot 21^1\ (= 21) + 3 \cdot 21^2\ (= 1323) + 1 \cdot 21^3\ (= 9261) = 10625$.
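The encoding of Example 1.5 can be sketched in a few lines of Python. This is only an illustration: the text fixes Gly = 1, Ala = 2, Ser = 3 and Tyr = 20, while the positions of the remaining components in `COMPONENTS` are assumed here, since Table 1.1 is not reproduced in this section. The names `encode` and `decode` are hypothetical helpers.

```python
# Hypothetical component order: only Gly=1, Ala=2, Ser=3 and Tyr=20 come from
# the text; the positions of the other residues are assumed for illustration.
COMPONENTS = ["Gly", "Ala", "Ser", "Pro", "Val", "Thr", "Cys", "Leu", "Ile",
              "Asn", "Asp", "Gln", "Lys", "Glu", "Met", "His", "Phe", "Arg",
              "Trp", "Tyr"]
BASE = len(COMPONENTS) + 1  # 21: digit 0 is reserved for "no amino acid"
DIGIT = {name: i + 1 for i, name in enumerate(COMPONENTS)}


def encode(sequence):
    """Map a list of residue names to its natural number (Horner's rule)."""
    number = 0
    for name in sequence:
        number = number * BASE + DIGIT[name]
    return number


def decode(number):
    """Recover the residue list from a natural number produced by encode()."""
    names = []
    while number > 0:
        number, digit = divmod(number, BASE)
        if digit == 0:
            raise ValueError("digit 0 inside a sequence is not a valid encoding")
        names.append(COMPONENTS[digit - 1])
    return names[::-1]


print(encode(["Gly", "Ser", "Gly", "Tyr"]))  # 10625
print(decode(10625))                         # ['Gly', 'Ser', 'Gly', 'Tyr']
```

Because the rightmost element multiplies $21^0$, dropping it is an integer division by 21, which is what makes the incremental computation of weights possible.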
The weights of all sequences up to a maximum molecular weight are therefore computed
off-line and stored in correspondence with the natural numbers representing the
sequences, as described above. This computation can be done efficiently by using
smaller solutions to gradually compute larger ones. Note that several sequences may
have the same molecular weight; hence, one weight may correspond to more than one
natural number, whereas one natural number corresponds to only one sequence, and
hence to one weight. The natural numbers need not be stored explicitly: they can
simply be the indices of an array holding the weights. This constitutes the weights
database: given a molecular weight, it returns almost instantaneously all the
sequences of components that could produce a portion of normalized peptide having
that weight. The maximum weight is chosen large enough to cover all the gaps that
may need to be sequenced in the current set of analyses.
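A minimal sketch of how such a database might be built from smaller solutions follows. It rests on several assumptions: a toy alphabet of three components with nominal integer masses (the real table has 20 components and non-integer masses), a dictionary instead of a dense array to keep the example small, and hypothetical function names. The weight of a sequence is obtained from the stored weight of the same sequence without its last element, so each entry costs a single addition.

```python
from collections import defaultdict

# Toy alphabet with nominal integer residue masses (Da); illustrative only.
MASS = {1: 57, 2: 71, 3: 87}   # digit -> mass for Gly, Ala, Ser
BASE = len(MASS) + 1           # digit 0 means "no amino acid"


def build_weights(max_weight):
    """Return {n: weight} for every valid number n with weight <= max_weight.

    Each entry is derived from the already stored prefix n // BASE.
    """
    weights = {0: 0}           # the empty sequence
    frontier = [0]
    while frontier:
        next_frontier = []
        for prefix in frontier:
            for digit, mass in MASS.items():
                w = weights[prefix] + mass
                if w <= max_weight:
                    n = prefix * BASE + digit  # append one element on the right
                    weights[n] = w
                    next_frontier.append(n)
        frontier = next_frontier
    return weights


def index_by_weight(weights):
    """Invert the table: weight -> list of natural numbers with that weight."""
    by_weight = defaultdict(list)
    for n, w in weights.items():
        by_weight[w].append(n)
    return by_weight


weights = build_weights(max_weight=150)
by_weight = index_by_weight(weights)
print(sorted(by_weight[128]))  # [6, 9]: Gly-Ala and Ala-Gly in base 4
```

With real, non-integer masses the lookup key would need rounding or a tolerance window; that choice is not specified in the text and is left out of this sketch.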
Therefore, for each gap $b_{h+1} - b_h$, the set of all the possible subsequences
$S(b_{h+1} - b_h)$ covering that gap is computed in an extremely short time by
searching the weights database for all natural numbers corresponding to the weight
$b_{h+1} - b_h$, and by explicitly generating the subsequences corresponding to those
natural numbers.
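The gap lookup can be illustrated as follows. The sketch assumes a toy three-component alphabet with nominal integer masses and replaces the precomputed database with a small brute-force stand-in (`toy_database`); all names are hypothetical, and exact integer matching is a simplification.

```python
from itertools import product

NAMES = ["Gly", "Ala", "Ser"]              # toy alphabet, digits 1..3
MASS = {"Gly": 57, "Ala": 71, "Ser": 87}   # nominal masses, illustrative only
BASE = len(NAMES) + 1


def decode(number):
    """Turn a natural number from the database back into a residue list."""
    names = []
    while number > 0:
        number, digit = divmod(number, BASE)
        names.append(NAMES[digit - 1])
    return names[::-1]


def toy_database(max_len):
    """Brute-force stand-in for the precomputed database: weight -> numbers."""
    db = {}
    for length in range(1, max_len + 1):
        for seq in product(NAMES, repeat=length):
            number = 0
            for name in seq:
                number = number * BASE + NAMES.index(name) + 1
            db.setdefault(sum(MASS[s] for s in seq), []).append(number)
    return db


def subsequences_for_gap(db, b_next, b_prev):
    """S(b_{h+1} - b_h): every subsequence whose weight equals the gap."""
    return [decode(n) for n in db.get(b_next - b_prev, [])]


db = toy_database(max_len=3)
print(subsequences_for_gap(db, 228, 100))  # gap of 128
```

For the gap of 128 used here, the lookup yields the two orderings of Gly and Ala, which shows why one weight can map to several natural numbers.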
When all the sets of subsequences $S(b_{h+1} - b_h)$, $h = 0, \dots, p$, are available,
all the possible sequences $S$ of the normalized peptide under the peak interpretation
can be generated by concatenating such sets in all possible ways, an operation that
we denote by $\oplus$, while eliminating sequences that violate the requirements on
the minimum $m_i$ or maximum $M_i$ number of each component.
$S = S(b_1 - b_0) \oplus S(b_2 - b_1) \oplus \cdots \oplus S(w_0 - c_a - c_0 - b_p)$
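The concatenation of the per-gap sets, with the filter on component counts, might look like the sketch below. The function name, the dictionary representation of the bounds $m_i$ and $M_i$, and the example gap sets are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def concatenate(subsequence_sets, min_count=None, max_count=None):
    """Concatenate one subsequence per gap, in gap order, in all possible ways.

    min_count / max_count map a component name to its bound m_i / M_i;
    components missing from a map are unconstrained.
    """
    min_count = min_count or {}
    max_count = max_count or {}
    results = []
    for choice in product(*subsequence_sets):
        sequence = [name for part in choice for name in part]
        counts = Counter(sequence)
        if any(counts[c] < m for c, m in min_count.items()):
            continue  # too few occurrences of some component
        if any(counts[c] > m for c, m in max_count.items()):
            continue  # too many occurrences of some component
        results.append(sequence)
    return results


# Hypothetical per-gap sets; Gly-Gly and Asn share the nominal mass 114.
gap_sets = [
    [["Gly", "Ala"], ["Ala", "Gly"]],
    [["Ser"]],
    [["Gly", "Gly"], ["Asn"]],
]
for seq in concatenate(gap_sets, max_count={"Gly": 2}):
    print("-".join(seq))
# Gly-Ala-Ser-Asn
# Ala-Gly-Ser-Asn
```

Without the bound on Gly, the same input produces four sequences; the two ending in Gly-Gly are eliminated because they contain three Gly.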
Finally, when considering the sets of all the possible sequences
$\{S_1, S_2, \dots, S_r\}$ for all the $r$ possible models of $F$, the complete set $S$
of all possible sequences of the normalized peptide is obtained:
$S = S_1 \cup S_2 \cup \cdots \cup S_r$