Biomedical Engineering Reference
tion process, for an appropriately chosen binning location and binning window,
the bin could contain all peaks of a protein across the spectra. The peaks in
the bin can then be assigned the mass ID of that protein.
As a powerful search method, the genetic algorithm (GA) has been
applied to many search problems (see [11] for instance). Genetic
algorithms are randomized optimization methods that need minimal
information about the problem to guide the search. They use a population of
multiple structures, each one encoding a tentative solution, to search
many zones of the problem space at the same time [12]. GA was
customized to search bins for mass spectra in [6]. Although it performs
very well, GA Binning (GAB) is a computation-intensive method and
theoretically obtains only a locally optimal solution in the binning search
because the search space in GAB is not compact [18]. The method in [20] tries
to avoid binning the peaks during data processing, but the binning idea is in
fact present through its use of a mean spectrum: the bin locations are
determined by the peaks in the mean spectrum. One of its drawbacks is that the
mean spectrum usually cannot represent the peak distribution of the spectra
accurately. It is desired to find a simple binning method, by which bin
locations are determined by two criteria: (1) the peaks selected in one bin
should meet the requirement for a certain signal-to-noise (S/N) ratio, and
(2) one bin combines only the peaks that differ by no more than a certain
number of clock ticks or a certain relative mass. This binning method actually
uses a constant initial bin width. Some clustering techniques are applied in
[5] to determine a so-called center spectrum for a binning procedure. In the
following, we present a new binning method, named projecting spectrum
binning (PSB). This method consists of two major steps: spectrum
projection and bin determination. Comparing PSB with GAB, the results
show that PSB bins peaks both effectively and efficiently. The binning
approach reduces the dimension of the data significantly.
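
The two criteria described above (a minimum S/N ratio and a maximum relative
mass difference within one bin) can be illustrated with a minimal sketch. This
is not the PSB method or the method of any cited reference. It is a simple
greedy pass written only to make the criteria concrete. The function name
bin_peaks, the parameters min_snr and rel_tol, and their default values are
assumptions chosen for illustration.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (mass, signal-to-noise ratio)


def bin_peaks(peaks: List[Peak],
              min_snr: float = 3.0,
              rel_tol: float = 0.003) -> List[List[Peak]]:
    """Greedy binning sketch (hypothetical helper, not from the source).

    Criterion 1: keep only peaks whose S/N ratio is at least min_snr.
    Criterion 2: a peak joins the current bin only if its mass differs
    from the lowest mass in that bin by at most rel_tol times that mass.
    """
    # Apply criterion 1, then sort the surviving peaks by mass.
    kept = sorted(p for p in peaks if p[1] >= min_snr)

    bins: List[List[Peak]] = []
    for mass, snr in kept:
        if bins:
            lowest = bins[-1][0][0]  # lowest mass in the current bin
            if mass - lowest <= rel_tol * lowest:
                bins[-1].append((mass, snr))
                continue
        # Start a new bin when the peak is too far from the current one.
        bins.append([(mass, snr)])
    return bins


if __name__ == "__main__":
    example = [(1000.0, 5.0), (1002.0, 4.0), (1010.0, 6.0), (1001.0, 1.0)]
    for b in bin_peaks(example):
        print(b)
```

Because the peaks are sorted and each candidate is compared with the lowest
mass in the bin, every pair of peaks in a bin satisfies the relative mass
tolerance. A tolerance expressed in clock ticks could be handled the same way
by replacing the relative test with an absolute one.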
Given a mass window with a window location and a window width, the
peak frequency in the mass window for a given set of spectra is easy to
calculate. Moving the mass window by a certain shifting unit from lower
mass to higher mass, we obtain a set of mass-frequency pairs (x, n), where
the mass x can be the middle value of the mass window and the frequency
n is the peak frequency of the spectra in the mass window. In other words,
if w(x) is the window width associated with the mass value x, then the
peak frequency f of the spectra can be expressed as f(x) = f(x; w(x)).
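
The sliding-window computation of the mass-frequency pairs (x, n) can be
sketched as follows. This is an illustration under stated assumptions, not
the authors' implementation: each spectrum is taken to be a list of peak
masses, the peak frequency is taken to be the number of peaks from all
spectra that fall inside the window centered at x, and the window is treated
as closed at both ends. The names mass_frequency, start, stop, step, and
width are hypothetical; width plays the role of w(x).

```python
from bisect import bisect_left, bisect_right
from typing import Callable, List, Sequence, Tuple


def mass_frequency(spectra: Sequence[Sequence[float]],
                   start: float,
                   stop: float,
                   step: float,
                   width: Callable[[float], float]) -> List[Tuple[float, int]]:
    """Return (x, n) pairs for window centers x = start, start + step, ...

    n counts the peaks of all spectra lying in [x - w/2, x + w/2],
    where w = width(x) is the window width associated with mass x.
    """
    # Pool the peak masses of all spectra and sort them once,
    # so that each window count is two binary searches.
    masses = sorted(m for spectrum in spectra for m in spectrum)

    pairs: List[Tuple[float, int]] = []
    k = 0
    x = start
    while x <= stop:
        half = width(x) / 2.0
        lo = bisect_left(masses, x - half)   # first peak >= lower edge
        hi = bisect_right(masses, x + half)  # first peak > upper edge
        pairs.append((x, hi - lo))
        k += 1
        x = start + k * step  # recompute from start to limit rounding drift
    return pairs


if __name__ == "__main__":
    spectra = [[1000.0, 2000.0], [1001.0, 2003.0], [999.5]]
    # Constant window width of 4 mass units, shifted by 2 units per step.
    for x, n in mass_frequency(spectra, 998.0, 1002.0, 2.0, lambda x: 4.0):
        print(x, n)
```

A mass-dependent window can be supplied in the same way, for example
width=lambda x: 0.003 * x for a window whose width grows in proportion to the
mass.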
According to the assumptions mentioned above, it is obvious that a protein
would generate a peak in the mass-frequency spectrum and a peak in the