Fig. 10. HMM topology used for transmembrane prediction. The fully green state cor-
responds to the cytoplasmic loop, blue state to the extracellular loop, and the fully or-
ange state to the core of the transmembrane region. The shaded states of green/orange
and blue/orange colors correspond to transmembrane regions nearer to the lipid
bilayer on the cytoplasmic and extracellular sides. Although the positive-inside
rule [40] applies to the loop region, thus characterizing cytoplasmic loops
differently from extracellular loops, no distinction has been made in this work
between cytoplasmic and extracellular loops. Hence, the topology shown on top
reduces to that on the bottom, with just three states.
best suited. Here, we considered an HMM with a simple architecture as shown
in Fig. 10. Each state is modeled with a mixture of 8 Gaussians. The vector of
wavelet coefficients computed for scales 4 to 16 at each residue position in the
protein is considered the feature vector corresponding to that residue. In Fig.
9A, the feature vectors correspond to columns in the 2D image of wavelet
coefficients, considering only rows 4 to 16. The data set used is the set of 160
proteins [41]. The data set is available as 10 disjoint sets so that separate data
may be used for training and testing. We used the first set (numbered 0) for
testing, and the remaining sets for training. The accuracy of classification of
each residue as transmembrane or non-transmembrane is found to be 80.0% (Q2).
Q2 refers to the percentage of residues that have been classified correctly into
the two states transmembrane and non-transmembrane. Although hidden Markov
models have been used earlier for transmembrane prediction, what is unique
here is the demonstration of the use of wavelet coefficients as feature vectors.
Within the speech recognition framework, wavelets have traditionally been used
for speech enhancement (similar to hydrophobicity smoothing in the case of
transmembrane prediction), but a recent paper has demonstrated the use of
wavelet coefficients as features for phoneme classification [35].
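
The feature extraction and the accuracy measure described above can be illustrated with a minimal sketch. It assumes that the wavelet coefficients are already available as a grid indexed by scale (rows, 1-based) and residue position (columns), and that per-residue labels are given as strings with one character per residue. The function names residue_features and q2, the label characters "M" and "-", and the toy data are hypothetical and are not taken from the original work; the HMM training itself is not shown.

```python
from typing import List, Sequence


def residue_features(scalogram: Sequence[Sequence[float]],
                     first_scale: int = 4,
                     last_scale: int = 16) -> List[List[float]]:
    """Return one feature vector per residue from a scales-by-residues grid.

    Assumption: scalogram[s - 1][i] holds the coefficient at scale s
    (1-based) and residue position i.
    """
    # Keep only the rows for scales first_scale..last_scale (inclusive).
    rows = scalogram[first_scale - 1:last_scale]
    # Transpose: each column of the selected rows becomes one feature vector.
    return [list(column) for column in zip(*rows)]


def q2(predicted: str, observed: str) -> float:
    """Percentage of residues whose two-state label matches the reference."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have equal length")
    if not observed:
        raise ValueError("label strings must be non-empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # Toy grid: 16 scales, 5 residues; each value encodes (scale, position).
    toy = [[10 * s + i for i in range(5)] for s in range(1, 17)]
    feats = residue_features(toy)
    print(len(feats), len(feats[0]))  # number of residues, vector length

    # Hypothetical labels: "M" = transmembrane, "-" = non-transmembrane.
    print(q2("MMMM-", "MMM--"))
```

With scales 4 to 16 inclusive, each residue is represented by a 13-dimensional vector. In the toy label strings, four of the five residues agree, so the Q2 value is 80.0.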
3.6 Membrane Helix Boundary Prediction Using N-Gram Features
The above work on transmembrane helix boundary prediction using signal pro-
cessing techniques borrowed from language technologies strongly complements
other applications of language technologies to the same task. As with phoneme
identification, other language technologies applications use segmentation ap-