Biomedical Engineering Reference
In-Depth Information
[Figure 1: flowchart for an example sequence (MGQTVTTPLSLTLQHWGDVQRIASNQS...). Panels: split the sequence into all 11-mers, i.e., each 9-mer with its flanking residues (e.g., M-GQTVTTPLS-L); keep only cleaved 11-mers; keep only TAP-binding 9-mers; for each HLA allele (HLA B*7001 and HLA A*0201 are shown), keep only MHC-binding 9-mers; count epitopes per allele.]
Fig. 1 Algorithm for SIR score computation. Each protein is divided into all nine-mers and the
appropriate flanking regions (a). For each nine-mer, a cleavage score is computed (b). For all
nine-mers with a positive cleavage score, we compute a TAP binding score and keep only
supra-threshold peptides (c). The MHC binding score of all cleaved, TAP-binding nine-mers is
computed (d). Nine-mers passing all these stages are defined as epitopes. We then compute the
number of epitopes per protein per HLA allele (e). The SIR score is defined as the ratio between
the number of predicted epitopes and the corresponding number in a random sequence with a
similar amino acid distribution.
distribution typical of viruses was 0.01 (i.e., 10 epitopes in 1,000 nine-mers), then
the SIR score of the sequence for HLA A*0201 would be 1.5 (15/10). The average
SIR score of a protein was defined as the average of the SIR scores for each HLA
allele studied, weighted by the allele's frequency in the average human population.
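The arithmetic above can be sketched in Python. This is an illustrative sketch only: the function names are hypothetical, and normalizing the weights by the sum of the allele frequencies is an assumption, since the text says only that the average is weighted by allele frequency.

```python
from typing import Dict

def sir_score(n_epitopes: int, n_nine_mers: int, random_density: float) -> float:
    """Ratio of predicted epitopes to the number expected in a random
    sequence with the given epitope density (epitopes per nine-mer)."""
    expected_random = n_nine_mers * random_density
    return n_epitopes / expected_random

def average_sir(sir_by_allele: Dict[str, float],
                allele_frequency: Dict[str, float]) -> float:
    """Frequency-weighted mean of per-allele SIR scores.
    Assumption: weights are normalized by the sum of the frequencies used."""
    total = sum(allele_frequency[a] for a in sir_by_allele)
    weighted = sum(sir_by_allele[a] * allele_frequency[a] for a in sir_by_allele)
    return weighted / total

# Example from the text: 15 predicted epitopes in 1,000 nine-mers,
# against a random-sequence density of 0.01 (10 expected epitopes).
score = sir_score(15, 1000, 0.01)
```

Here score corresponds to the 15/10 ratio in the text. Any allele frequencies passed to average_sir would have to come from population data, which this excerpt does not give.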