Biomedical Engineering Reference
In-Depth Information
By construction, the set of all possible sequences of the normalized peptide
is also the set of all possible sequences of the real peptide under analysis; hence,
the sequencing problem has been solved.
Note that, when the formula S is unsatisfiable and a truth assignment
maximizing the number of clauses that evaluate to True has been found,
some gaps may admit no subsequence because some incompatibility clauses are
not respected. In this case a less reliable solution can be obtained by merging each
unsequenceable gap with one of its neighboring gaps (preferably the smaller).
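The merging fallback can be sketched as follows. This is only an illustrative C++ sketch, not the authors' code: it assumes that the breakpoint succession is stored as an increasing vector of masses that includes both end points (0 and the total peptide mass), so that gap g lies between entries g and g+1. The function name mergeGapWithSmallerNeighbor is hypothetical. Under these assumptions, merging a gap with a neighbor amounts to deleting the breakpoint the two gaps share.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fuse gap `g` with its smaller neighboring gap.
// `bp` is assumed to be sorted and to contain both end points, so that
// gap i is the interval [bp[i], bp[i + 1]].
std::vector<double> mergeGapWithSmallerNeighbor(std::vector<double> bp,
                                                std::size_t g) {
    assert(bp.size() >= 3);     // at least two gaps must exist
    assert(g + 1 < bp.size());  // g must be a valid gap index

    const std::size_t lastGap = bp.size() - 2;
    bool mergeLeft;
    if (g == 0) {
        mergeLeft = false;      // first gap: only a right neighbor exists
    } else if (g == lastGap) {
        mergeLeft = true;       // last gap: only a left neighbor exists
    } else {
        const double left = bp[g] - bp[g - 1];
        const double right = bp[g + 2] - bp[g + 1];
        mergeLeft = left <= right;  // prefer the smaller neighbor
    }

    // Deleting bp[g] fuses gap g with the left gap;
    // deleting bp[g + 1] fuses it with the right gap.
    const std::size_t shared = mergeLeft ? g : g + 1;
    bp.erase(bp.begin() + static_cast<std::ptrdiff_t>(shared));
    return bp;
}
```

The merged gap is wider, so it can be covered by longer subsequences, which is why the resulting solution is less reliable than one in which every original breakpoint is kept.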
Example 1.6. When considering the formula F of Example 1.4 with 108 variables,
4909 clauses and three models, computing the weights database with D 300 we
obtain three breakpoint successions, reported below together with all their
corresponding possible sequences:
{87.0, 224.2, 339.2, 452.2, 565.2, 662.2}, which gives two sequences:
Ser-His-Asp-Leu-Leu-Pro-Gly-Leu
Ser-His-Asp-Leu-Leu-Pro-Leu-Gly
{87.0, 224.2, 339.2, 452.2, 565.2, 678.3}, which gives two sequences:
Ser-His-Asp-Leu-Leu-Leu-Gly-Pro
Ser-His-Asp-Leu-Leu-Leu-Pro-Gly
{87.0, 184.0, 355.2, 452.2, 565.2, 662.2}, which gives four sequences:
Ser-Pro-Gly-Asn-Pro-Leu-Pro-Gly-Leu
Ser-Pro-Gly-Asn-Pro-Leu-Pro-Leu-Gly
Ser-Pro-Asn-Gly-Pro-Leu-Pro-Gly-Leu
Ser-Pro-Asn-Gly-Pro-Leu-Pro-Leu-Gly
However, since in this series of examples we selected only the labelled peaks from
the spectrum of Fig. 1.1, the results are not as accurate as they would be if more
peaks were selected.
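To make the step from a breakpoint succession to its sequences more concrete, the following C++ sketch enumerates the ordered residue subsequences whose total mass matches a single gap. It is a simplified illustration under stated assumptions, not the procedure used in the example: the names Residue, exampleAlphabet, coverGap and subsequencesForGap are hypothetical, the alphabet is a small illustrative subset with standard monoisotopic residue masses, and the tolerance is an arbitrary choice.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

struct Residue {
    std::string name;
    double mass;  // monoisotopic residue mass in Da
};

// Illustrative subset of components (an assumption, not the source's table).
std::vector<Residue> exampleAlphabet() {
    return {{"Gly", 57.02146}, {"Ser", 87.03203}, {"Pro", 97.05276},
            {"Leu", 113.08406}, {"Asn", 114.04293}};
}

// Depth-first branching: extend `prefix` with every residue that still fits.
void coverGap(const std::vector<Residue>& alphabet, double remaining,
              double tol, std::vector<std::string>& prefix,
              std::vector<std::string>& out) {
    if (!prefix.empty() && std::fabs(remaining) <= tol) {
        std::string s;
        for (std::size_t i = 0; i < prefix.size(); ++i) {
            if (i > 0) s += '-';
            s += prefix[i];
        }
        out.push_back(s);
        return;  // every residue is heavier than 2*tol, so no extension fits
    }
    for (const Residue& r : alphabet) {
        if (r.mass <= remaining + tol) {
            prefix.push_back(r.name);
            coverGap(alphabet, remaining - r.mass, tol, prefix, out);
            prefix.pop_back();
        }
    }
}

// All ordered subsequences whose mass equals `gap` within `tol`.
std::vector<std::string> subsequencesForGap(const std::vector<Residue>& alphabet,
                                            double gap, double tol) {
    for (const Residue& r : alphabet) {
        assert(r.mass > 2 * tol);  // guarantees termination of the branching
    }
    std::vector<std::string> prefix, out;
    coverGap(alphabet, gap, tol, prefix, out);
    return out;
}
```

With this alphabet and a tolerance of 0.2 Da, a gap of 170.1 Da (the mass of Gly plus Leu) is covered only by Gly-Leu and Leu-Gly, which is the kind of two-way ambiguity visible at the end of the sequences listed above.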
1.7 Implementation and Results
The described approach is implemented in C++. The initial input routine (1) reads
all information about the possible components and the possible types of fragments
and charges and computes the weights database, and (2) reads the spectrum and
extracts from it all peaks above a certain value. After this, the logic formula F
representing the peak interpretation problem is generated. All models of F are then
found by means of the DPLL SAT solver BrChaff [6], modified in order to search for
all the models of the given formula. Then, for each model of F, the breakpoint
succession is computed, and all the possible subsequences covering each gap are
computed and linked together.
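The idea of searching for all models rather than stopping at the first one can be illustrated with a very small C++ sketch. It is not BrChaff and makes no claim about how that solver was modified: it is a plain backtracking search (the branching core of DPLL, without unit propagation or clause learning) that records every satisfying assignment. Clauses are assumed to use DIMACS-style integer literals, and the names Clause, Model, conflict, search and allModels are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

using Clause = std::vector<int>;  // literals: +v or -v, with v >= 1
using Model = std::vector<bool>;  // index v holds the value of variable v

// True if some clause has all its literals assigned and false.
// Assignment encoding: 0 = unassigned, +1 = true, -1 = false.
static bool conflict(const std::vector<Clause>& cnf, const std::vector<int>& a) {
    for (const Clause& c : cnf) {
        bool canBeTrue = false;
        for (int lit : c) {
            const int val = a[std::abs(lit)];
            if (val == 0 || (val > 0) == (lit > 0)) {
                canBeTrue = true;
                break;
            }
        }
        if (!canBeTrue) return true;
    }
    return false;
}

// Branch on variables 1..numVars in order; keep going after each model.
static void search(const std::vector<Clause>& cnf, std::vector<int>& a,
                   int var, int numVars, std::vector<Model>& models) {
    if (conflict(cnf, a)) return;
    if (var > numVars) {  // full assignment with no falsified clause
        Model m(numVars + 1, false);
        for (int v = 1; v <= numVars; ++v) m[v] = a[v] > 0;
        models.push_back(m);
        return;
    }
    for (int val : {+1, -1}) {
        a[var] = val;
        search(cnf, a, var + 1, numVars, models);
    }
    a[var] = 0;  // undo the assignment before returning to the caller
}

std::vector<Model> allModels(const std::vector<Clause>& cnf, int numVars) {
    for (const Clause& c : cnf)
        for (int lit : c) assert(lit != 0 && std::abs(lit) <= numVars);
    std::vector<int> a(numVars + 1, 0);
    std::vector<Model> models;
    search(cnf, a, 1, numVars, models);
    return models;
}
```

For the two-variable formula (x1 or x2) and (not x1 or not x2), allModels returns the two assignments in which exactly one variable is true. A search this simple is exponential in the number of variables and is meant only to show the control flow; a formula with 108 variables and 4909 clauses calls for a real solver.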
Those subsequences may be produced either by a specialized branching algorithm
working on-line, or by means of the weights database computed off-line and used
on-line. Finally, by considering the union of the sets of sequences corresponding
to the different models of F, all the solutions of the sequencing problem are
obtained.
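The linking and union steps can be sketched in C++ as follows. This is a hedged illustration, not the authors' implementation: it assumes that each model has already been turned into a list of gaps, each gap carrying its alternative subsequences as hyphen-separated strings, and the names GapOptions, linkGaps and unionOverModels are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Alternative subsequences covering one gap, e.g. {"Gly-Leu", "Leu-Gly"}.
using GapOptions = std::vector<std::string>;

// Link the gaps of one model: take one alternative per gap, in gap order.
// If some gap has no alternative, the model yields no sequence at all.
std::vector<std::string> linkGaps(const std::vector<GapOptions>& gaps) {
    assert(!gaps.empty());
    std::vector<std::string> seqs{""};
    for (const GapOptions& options : gaps) {
        std::vector<std::string> next;
        for (const std::string& head : seqs) {
            for (const std::string& tail : options) {
                next.push_back(head.empty() ? tail : head + "-" + tail);
            }
        }
        seqs.swap(next);
    }
    return seqs;
}

// Union of the sequence sets of all models; std::set removes duplicates.
std::set<std::string> unionOverModels(
    const std::vector<std::vector<GapOptions>>& models) {
    std::set<std::string> all;
    for (const std::vector<GapOptions>& gaps : models) {
        for (const std::string& s : linkGaps(gaps)) {
            all.insert(s);
        }
    }
    return all;
}
```

Applied to the third breakpoint succession of Example 1.6, with gaps {Ser}, {Pro}, {Gly-Asn, Asn-Gly}, {Pro}, {Leu}, {Pro}, {Gly-Leu, Leu-Gly}, linkGaps produces the four sequences listed there, since two gaps have two alternatives each.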