Biology Reference
In-Depth Information
protein-coding potential. 78,79 Much of this “noisy” expression indeed
appears to represent long primary transcripts that are processed into much
shorter mature RNAs, e.g. a 3-kb human transcript with circadian expression
that seems to encode only one miRNA (hsa-mir-122).
ncRNA genes are subject to the same evolutionary processes as
protein-coding genes, leading to gene duplication, diversification,
pseudogenization, and loss. As in the analysis of protein orthology,
identifying the loss of ncRNA genes that are widely conserved among
other organisms can point to species-specific traits.
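
The loss screen described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name lineage_specific_losses, the min_fraction threshold, and the toy species and family names are all hypothetical and do not come from any published pipeline. It also assumes that presence/absence calls for each ncRNA family are already available for every species.

```python
from typing import Dict, List, Set


def lineage_specific_losses(
    presence: Dict[str, Set[str]],
    target: str,
    min_fraction: float = 0.8,
) -> List[str]:
    """Return ncRNA families absent from `target` but present in at least
    `min_fraction` of the other species (hypothetical helper)."""
    others = [sp for sp in presence if sp != target]
    if not others:
        return []
    target_families = presence.get(target, set())
    # Every family seen in at least one of the comparison species.
    all_families = set().union(*(presence[sp] for sp in others))
    losses = []
    for family in sorted(all_families):
        if family in target_families:
            continue  # still present in the target, so not a loss
        carriers = sum(family in presence[sp] for sp in others)
        if carriers / len(others) >= min_fraction:
            losses.append(family)
    return losses


# Toy, made-up presence/absence data; all names are placeholders.
toy = {
    "species_A": {"fam1", "fam2", "fam3"},
    "species_B": {"fam1", "fam2", "fam3"},
    "species_C": {"fam1", "fam2"},
    "species_D": {"fam1", "fam3"},
}
print(lineage_specific_losses(toy, "species_D", min_fraction=1.0))
```

A family flagged this way is only a candidate. An apparent absence can also reflect gaps in the assembly or annotation, and telling a true loss from a gain in the other lineages requires placing the pattern on a species phylogeny.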
Early comparisons of nucleotide sequences from divergent
vertebrate species had already identified highly conserved stretches in
noncoding regions of genes, with more than 70% identity over more than 100
nucleotides. This was unexpected, given the much shorter and less specific
sequences required for the binding of regulatory proteins. 80 The true
extent of such conservation only became apparent with the sequencing and
comparison of multiple mammalian genomes: applying the same definition of
conservation to human-mouse alignments identified hundreds of thousands of
such conserved noncoding sequences (CNSs), constituting about 1%-2% of the
genomes. 81 Interestingly, there is
a higher incidence of CNSs within gene-poor regions (so-called “gene
deserts”), and these sequences are not repetitive and do not share easily
identifiable sequence features. The term “noncoding” was suggested
because no evidence of expression was found for many experimentally
and computationally scrutinized CNSs on human chromosome 21. The
hypothesis that CNSs merely represent regions with lower local mutation
rates (mutational “cold spots”) was rejected by an analysis of allele
frequency distributions in human HapMap genotype data, which indicated
that these CNSs are selectively constrained and are therefore likely to be
functional. 82 A subset of these sequences comprises the so-called
ultraconserved elements, which have remained largely intact since the
divergence of mammals from chicken and even from fish. 83 Hundreds of such deeply conserved
sequences have been experimentally tested within the framework of the
VISTA Enhancer project, and almost half of them showed tissue-specific
enhancer activity. Recent analysis of the opossum genome revealed
that 20% of eutherian CNSs appear to be recent inventions after the