Chapter 2
The Coding and the Non-coding Transcriptome
Roderic Guigó 1,2
1 Centre de Regulació Genòmica Universitat Pompeu Fabra, Dr Aiguader, 88, E-08003 Barcelona, Catalonia, Spain
2 Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Dr Aiguader, 88, E-08003 Barcelona, Catalonia, Spain
Chapter Outline
Introduction
The Pathway from DNA to Protein Sequences
Methods to Determine the Reference Transcriptome
Experimental Methods
Random cDNA Cloning
DNA Microarrays
RNASeq
Computational Methods
Integrated Computational Gene Prediction
The Use of Chromatin Marks
Assessing the Reference Transcriptome
The Human Transcriptome
The Number of Human Genes
Human Genome Reference Gene Sets
The Protein-coding Transcriptome
The Long Non-coding RNA Transcriptome
The Expressed Transcriptome
The Small RNA Transcriptome
Conclusions and Future Challenges
References
INTRODUCTION
The unfolding of the instructions encoded in the genome is triggered by the transcription of DNA into RNA, and the subsequent processing of the resulting primary RNA transcripts into functional mature RNAs. The population of all RNAs in the cell, the so-called transcriptome, is therefore the first phenotypic manifestation of the genome and, at the same time, the determinant of the higher-order phenotypes of the cell, necessarily mediating all phenotypic changes at the organism level caused by changes in the DNA sequence. Until very recently it was assumed that the main role of RNA was merely that of a messenger, mediating the transfer of information from DNA to proteins, which were assumed to be the main effectors of biological function. Indeed, proteins are essential parts of organisms and participate in virtually every process within cells. They catalyze the biochemical reactions in metabolic pathways, and also have structural and mechanical functions. In recent years, however, a plethora of novel RNA species has been discovered in all organisms. Some of these species correspond to novel splice forms, often of extraordinary complexity, of known protein-coding genes [1,2], but many do not appear to code for proteins and belong to novel families of small [3–7] or longer multi-exonic non-coding RNAs [8–12]. Although the precise role of the vast majority of these species is still unknown, many seem to be involved in the regulation of gene expression. It is thus becoming clear that RNA plays a complex, multifaceted role in cellular homeostasis, rather than merely serving as a carrier of information from DNA to proteins.
Traditional methods for global transcriptome characterization, such as EST sequencing and DNA microarrays, have recently been complemented by deep RNA sequencing (RNASeq) using massively parallel sequencing instruments. Profiling of RNA by RNASeq across multiple conditions (cell types, species, individuals, cellular compartments, RNA classes, perturbations, etc.) is revealing a eukaryotic transcriptome of unanticipated complexity. Combined with the profiling of other players involved in RNA synthesis and processing (DNA methylation status, chromatin structure and modifications, transcription factors, RNA polymerase, splicing regulators, etc., also greatly facilitated by massively parallel sequencing), we can now obtain a global, holistic view of the transcriptional activity of the genome within the cell, allowing for the first time a systems biology