of structural similarities that become very limited when the sequence
identity levels drop below a certain threshold, usually around 30%.
Experience shows that, with decreasing identity levels, alignment errors
first appear in loop regions before affecting larger portions of the
proteins. It is also nearly impossible to align loops of low sequence
identity and unequal length correctly, so finding adequate solutions for
loop structures is left to the later model-building process.
Furthermore, many protein structures sharing a sequence identity level
below 40% contain structurally nonconserved loops, even if they have the
same length. Therefore, it becomes apparent that even the alignment of
loops with identical length but completely different sequences has little
meaning in structural biology.
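
Because the thresholds above are expressed as sequence identity levels, it
may help to see how such a value can be computed from a pairwise alignment.
The following is a minimal sketch, not a definitive implementation: the
function name percent_identity and the gap character "-" are assumptions
made for illustration, and identity is counted only over columns in which
both sequences carry a residue. Other conventions exist (for example,
dividing by the full alignment length or by the length of the shorter
sequence) and give different numbers for the same alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where both sequences have a residue.

    Assumption: both inputs are rows of the same pairwise alignment, so they
    have equal length and use `gap` to mark insertions/deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # columns containing a gap are not counted
        compared += 1
        if a.upper() == b.upper():
            matches += 1

    if compared == 0:
        return 0.0  # no residue-residue columns to compare
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: 6 comparable columns, 5 identical residues.
    print(round(percent_identity("ACDE-FG", "ACDQKFG"), 1))
```

Note that a single identity value hides where the mismatches fall. As the
text points out, loops may be structurally nonconserved even when they have
the same length, so a value computed this way says little about loop regions.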
4.1.2. Model accuracy
The accuracy of a protein model is largely limited by the deviation of the
template structure(s) used from the experimental structure of the
target. This limitation is inherent to the method, since comparative
models result from a structural extrapolation guided by a sequence alignment.
As shown by comparison of experimentally elucidated structures,
there is a direct correlation between the sequence identity level of a
protein pair and the deviation of the Cα atoms of their common core. 36 It is
therefore generally accepted that the percentage of sequence identity
between target and template allows for a reasonable first estimate of the
model quality, and that the core Cα atoms of protein models sharing 50%
sequence identity with their templates will deviate from their
experimentally elucidated structures by a root mean square deviation
(RMSD) of approximately 1.0 Å 36; this is roughly comparable to the
accuracy of a medium-resolution NMR-derived structure or a low-resolution
X-ray structure. 37,38 This has led to the definition of three broad classes of
model quality based on the level of identity of the core region common
to both target and template sequences. Firstly, models based on more
than 50% identity are considered high-accuracy models, where inaccuracies
are mostly restricted to side-chain packing and loop regions. Secondly,
comparative models based on 30% to 50% sequence identity can be
considered medium-accuracy models, where the most frequent inaccuracies