prior experimental data, while ab initio methods provide valuable
unbiased information to complete the annotations. Pairwise or multiple
genome comparisons are informative when reliable DNA-level alignments
of orthologous genomic regions are available (e.g. from the MULTIZ [5]
or VISTA [6] pipelines), as they can be used directly to measure
selective constraints, which helps identify functionally important
sequences and motifs. As all gene prediction approaches have different
advantages and trade-offs, new strategies have been designed to weigh
the sources of evidence and to combine the different gene predictions
into a set of consensus gene models, e.g. GLEAN [7], GeneComber [8] and
JIGSAW [9]. Assessing the accuracy of the various approaches requires a
gold standard annotation set against which to measure predictive
performance in terms of sensitivity and specificity. Consistent
benchmarking of the major gene prediction methods has recently been
undertaken within the framework of the EGASP [10] (Fig. 1) and
NGASP (http://www.wormbase.org/wiki/index.php/NGASP) initiatives
(ENCODE and Nematode Genome Annotation Assessment Projects,
respectively).
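
As an illustration of how predictive performance can be measured against a gold standard, the following minimal Python sketch computes nucleotide-level sensitivity and specificity. It assumes the convention common in gene-prediction benchmarks, in which specificity is reported as TP / (TP + FP) rather than the classical TN / (TN + FP), because non-coding positions vastly outnumber coding ones. The function names and the exon coordinates are hypothetical and are not taken from EGASP or NGASP.

```python
def covered_positions(intervals):
    """Return the set of bases covered by 1-based, inclusive (start, end) intervals."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_accuracy(reference_exons, predicted_exons):
    """Compare predicted coding bases with a reference annotation.

    Returns (sensitivity, specificity), where
      sensitivity = TP / (TP + FN)  -- fraction of reference bases recovered
      specificity = TP / (TP + FP)  -- fraction of predicted bases that are correct
    (assumed gene-prediction convention; see the note above).
    """
    ref = covered_positions(reference_exons)
    pred = covered_positions(predicted_exons)
    tp = len(ref & pred)   # bases annotated and predicted as coding
    fn = len(ref - pred)   # annotated bases the prediction missed
    fp = len(pred - ref)   # predicted bases absent from the reference
    sensitivity = tp / (tp + fn) if ref else float("nan")
    specificity = tp / (tp + fp) if pred else float("nan")
    return sensitivity, specificity


# Hypothetical exon coordinates for a single gene (illustrative only).
reference = [(101, 200), (301, 400)]
predicted = [(151, 200), (301, 500)]

sn, sp = nucleotide_accuracy(reference, predicted)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

Storing every covered base in a set keeps the sketch short but is only practical for small regions; a genome-scale comparison would need interval arithmetic instead. Benchmarks typically also report accuracy at the exon and gene levels, which this sketch does not cover.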
Taking advantage of the wealth of evolutionary information inherent
in multi-species genomic comparisons can even challenge the gold
standard of Drosophila melanogaster annotations [2,3], despite the
intensive human expert curation and experimental validation over many
years of research. This type of comparative approach is even more critical for
identifying noncoding RNAs, which are normally elusive because their
sequence conservation decays rapidly even when their structure is
conserved. Although the applicability of such DNA-level comparative
approaches may be limited for distantly related species, they are
nonetheless becoming increasingly important as more genomic data become
available and a new generation of sequence alignment and prediction
tools emerges. The rapidly accumulating genome sequence data greatly
facilitate the discovery and accurate annotation of genes and other
functional genomic elements, while simultaneously highlighting the vital
importance of computational approaches. Although comparative studies
provide an expectation of the total gene count (see Fig. 3), any exact
number is likely to be unreliable, as exemplified by the fact that the
human gene count has been constantly and substantially revised over the