Biology Reference
In-Depth Information
CRE can refer either to individual TF-binding sites, or to
a collection of TF-binding sites clustered within a broader,
contiguous stretch of sequence such as a CRM (Figure 4.1C).
CRMs that activate gene expression are frequently referred to
as transcriptional enhancers, while those that repress gene
expression are referred to as transcriptional silencers.
Classically, enhancers were defined as being able to activate
gene expression regardless of distance from the transcription
start site (TSS), location upstream or downstream of the TSS,
or orientation [30]. However, it has been found more recently
that such activating elements are often not completely
independent of location and orientation (indeed, such effects
on CRM activity are often not even tested); instead, the term
'enhancer' is now more commonly used to refer to an activating
CRM comprising a cluster of TF DNA-binding sites [30]. Other
types of DNA elements, most notably gene promoters (see below)
and insulators, also contribute to the regulation of gene
expression. Insulators are DNA elements that, when placed
between an enhancer and a promoter, can block the activating
effect of the enhancer on gene expression. In addition to such
enhancer-blocking insulators, there are also barrier
insulators, which block the spread of heterochromatinization
and subsequent gene silencing [31].
In microbial genomes, such as Escherichia coli and
Saccharomyces cerevisiae, CREs typically occur within
a few hundred base pairs upstream of TSSs. Such upstream
regions where regulatory elements occur are often loosely
referred to as 'promoter regions'. These regions can be
classified into: 1) basal promoters, which harbor core elements
that are bound by general TFs and RNA polymerase II; and
2) 'proximal promoters', a generic term for the sequence
within roughly 1 kb upstream of the TSS. More complex
eukaryotes additionally display regulation of promoter
activity by distal regions, such as enhancers that can be
located tens of kilobases upstream of TSSs. Various
technologies, including chromatin immunoprecipitation
(ChIP; see Box 4.1) of RNA polymerase II [32], 5′ RACE
(rapid amplification of cDNA ends) [33], 5′ CAGE (cap
analysis of gene expression) [34], and RNA-seq [35], have
been used to identify the 5′ ends of transcripts; promoters
are then inferred as the sequence located immediately
upstream of these 5′ ends.
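The inference step described above (take the mapped 5′ end of a transcript and treat the sequence immediately upstream as the promoter) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the 0-based coordinate convention, and the default window of 1,000 bp are assumptions made here, not part of any of the cited methods.

```python
# Illustrative sketch: infer a promoter window from a mapped transcript 5' end.
# Assumptions (not from the text): 0-based coordinates, the TSS is the index of
# the first transcribed base, and the genome is a plain uppercase string.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_interval(tss, strand, upstream=1000, chrom_length=None):
    """Return the (start, end) half-open interval immediately upstream of a TSS.

    On the '+' strand, upstream means lower coordinates; on the '-' strand,
    upstream means higher coordinates. The interval is clipped to the
    chromosome boundaries.
    """
    if strand == "+":
        start, end = max(0, tss - upstream), tss
    elif strand == "-":
        start, end = tss + 1, tss + 1 + upstream
        if chrom_length is not None:
            end = min(end, chrom_length)
    else:
        raise ValueError("strand must be '+' or '-'")
    return start, end


def promoter_sequence(genome, tss, strand, upstream=1000):
    """Return the inferred promoter sequence, read 5'->3' on the gene's strand."""
    start, end = promoter_interval(tss, strand, upstream, len(genome))
    region = genome[start:end]
    return region if strand == "+" else reverse_complement(region)


if __name__ == "__main__":
    toy_genome = "ACGTACGTAGGCTTAA"  # made-up sequence for demonstration
    print(promoter_sequence(toy_genome, tss=8, strand="+", upstream=4))
    print(promoter_sequence(toy_genome, tss=7, strand="-", upstream=4))
```

For minus-strand genes the window lies at higher genomic coordinates and is reverse-complemented, so that the returned sequence always reads toward the TSS.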
TF-Binding Sites
The identification of the DNA-binding sites of TFs (typically
~6–15 bp in length) is of fundamental importance for the
understanding of systems-level gene regulation. Various
computational and experimental approaches (see below, and
Box 4.1 and Box 4.2) have been developed to identify TF
DNA-binding sites (either genomic sites or in vitro TF-DNA
sequence specificities) and regulatory DNA motifs (here
'motif' refers to a computational model representing a set of
similar sequences that putatively share a functional role,
for instance by interacting with a particular TF).
Computational (or in silico) approaches include searching
non-coding genomic sequences upstream of co-expressed genes
and genome-wide searches for over-represented,
phylogenetically conserved DNA sequence motifs [36–38]. The
underlying hypothesis in such studies is that the non-coding
regions of co-expressed genes are more likely to be bound and
regulated by similar TF(s), and that such TF-binding sites
have been conserved through evolution. Once TF-binding sites
or putative CRMs have been identified, computational
algorithms can be applied to search a genome, in particular
its non-coding regions, for matches to such sequences
(Box 4.2). However, because these sites are short, many
sequence matches occur by chance alone and may not actually
serve a regulatory function. Furthermore, many such sites in
eukaryotic genomes are not available for TF binding in a
particular cell type or at a particular time point because
they are occluded by nucleosomes. To address these
challenges, more advanced computational algorithms have been
developed that score genomic sequences in a weighted manner
according to sequence accessibility [39].
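The points made above (searching a sequence for motif matches, the abundance of chance matches to short sites, and weighting by accessibility) can be illustrated with a minimal Python sketch. It assumes a simple log-odds position weight matrix built from made-up counts, a uniform background, forward-strand scanning only, and a multiplicative accessibility weight; that weighting scheme is an assumption for illustration and is not the algorithm of [39]. All function names are hypothetical.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=1.0, background=None):
    """Turn per-position base counts into a log2-odds position weight matrix.

    counts: list of dicts mapping base -> count, one dict per motif position.
    """
    background = background or {b: 0.25 for b in BASES}  # assumed uniform
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background[b])
            for b in BASES
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (position, score) for forward-strand windows scoring >= threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[j][base] for j, base in enumerate(window))
        if score >= threshold:
            hits.append((i, score))
    return hits


def expected_chance_matches(site_length, search_space_bp, both_strands=True):
    """Expected exact matches to one fixed site in uniform random sequence."""
    expected = search_space_bp * 4.0 ** (-site_length)
    return 2 * expected if both_strands else expected


def accessibility_weighted(hits, accessibility, width):
    """Scale each hit score by the mean accessibility (0..1) over its window.

    Illustrative weighting only; intended for hits with positive scores.
    """
    return [
        (i, score * sum(accessibility[i:i + width]) / width)
        for i, score in hits
    ]


if __name__ == "__main__":
    # Made-up count matrix whose consensus is TGACTC (10 sites, all identical).
    pwm = log_odds_matrix([{b: 10} for b in "TGACTC"])
    hits = scan("AAATGACTCAAA", pwm, threshold=8.0)
    openness = [0.0] * 3 + [0.5] * 6 + [0.0] * 3  # hypothetical accessibility track
    print(hits)
    print(accessibility_weighted(hits, openness, width=len(pwm)))
    print(expected_chance_matches(6, 12_000_000))  # 6 bp site, 12 Mb, both strands
```

Under the uniform-background assumption, a fixed 6 bp site is expected roughly once every 4,096 bp per strand, which is why a sequence match alone is weak evidence of regulatory function.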
Cis-Regulatory Modules (CRMs)
In the yeast S. cerevisiae, DNA-binding sites of regulatory
TFs typically occur within ~600 bp upstream of genes, and
effective computational methods exist for mapping TFs and
their associated DNA-binding site motifs to their target
genes (Box 4.2) [40–42]. In contrast, metazoan CREs can
be located far from the TSSs of the genes they regulate [43].
Moreover, in metazoans, regulatory motifs tend to co-occur

... parameterized using protein-DNA-binding measurements and constrained by reporter gene assays in which the EE was used to drive β-galactosidase
expression in the lens. (Top left) Schematic of the two-site model. (Top right) Biophysical function describing the fraction of maximal EE activation (F),
dependent on protein concentrations ([A], [P]), protein-DNA-binding constants (K_A, K_P), and protein-protein binding constants (K_AP, K_AA). (Lower left
panel) Modeling of EE activation vs. Prep1 concentration (log2 scale) for wild-type EE (i.e., L1L2) modeled with synergy between DNA-bound Prep1
molecules (black solid line) or without synergy (black dashed line), single-site mutations (ΔL1L2, L1ΔL2, gray dashed line) or high-affinity site
mutations (ΔL1L2*, L1ΔL2*, gray solid line; L1*L2*, blue solid line; * indicates high-affinity site mutant). (Lower right panel) Ratio of activation levels
for reporter constructs modeled in the left panel, to facilitate the comparison of model predictions with ratios of relative EE reporter levels. Estimates of the
reporter level ratios (e.g. L1*ΔL2/L1ΔL2 > 2, L1L2/L1*ΔL2 > 4; L1*L2*/L1L2 < 1.3) provide constraints on the Prep1 concentration ([A] in model) and
allowed estimation of the physiological concentration of Prep1 (shaded red rectangle). The model correctly predicted that the greatest variation between native EE (L1L2)
and the mutant with high-affinity sites (L1*L2*) would occur at lower Prep1 concentrations (see red curve) that occur earlier in development. (Adapted from
[29] with permission from Cold Spring Harbor Laboratory Press).
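The biophysical function itself is not reproduced in this caption, so it cannot be restated here. As a rough illustration of the kind of two-site model the caption describes, the sketch below uses a generic equilibrium occupancy model in which a single factor binds two sites and activation is assumed to be proportional to the probability that both sites are occupied. The function name, the parameters k1, k2 and omega, and all numerical values are hypothetical and are not taken from [29].

```python
# Generic two-site equilibrium occupancy sketch (NOT the function from [29]).
# Assumptions: one factor at concentration `conc` binds two sites with
# association constants k1 and k2; `omega` is a cooperativity (synergy) factor
# applied when both sites are occupied; activation is taken to be proportional
# to the probability that both sites are bound.


def fraction_both_bound(conc, k1, k2, omega=1.0):
    """Probability that both sites are occupied at equilibrium."""
    w1 = k1 * conc            # statistical weight: only site 1 bound
    w2 = k2 * conc            # statistical weight: only site 2 bound
    w12 = omega * w1 * w2     # statistical weight: both sites bound
    return w12 / (1.0 + w1 + w2 + w12)


if __name__ == "__main__":
    # Compare a hypothetical wild-type element (k = 1) with a hypothetical
    # high-affinity double mutant (k = 10) across a range of concentrations.
    for conc in (0.01, 0.1, 1.0, 10.0, 100.0):
        wild_type = fraction_both_bound(conc, 1.0, 1.0, omega=10.0)
        high_affinity = fraction_both_bound(conc, 10.0, 10.0, omega=10.0)
        print(f"{conc:>6}: ratio high-affinity/wild-type = "
              f"{high_affinity / wild_type:.2f}")
```

In this simplified setting the ratio between the high-affinity and wild-type constructs is largest at low concentrations and approaches 1 as both sites saturate, which matches the qualitative behavior described in the caption.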