Box 4.1 Experimental Technologies for Identifying Protein-DNA Interactions (Cont'd)

In vitro selection ('SELEX'):
What it provides:
• In vitro DNA-binding specificity data
How it works:
• A library of synthetic oligonucleotides, typically containing ~10–25 bp of degenerate sequence, is synthesized and incubated with a protein of interest under conditions of mild to moderate stringency; protein-bound DNA fragments are captured and amplified; typically several rounds of selection and elution are performed, sometimes entailing a mutagenic PCR step(s) to further increase DNA sequence diversity, in order to select for sequences bound with higher affinity; bound sequences are identified by sequencing [48]
Throughput:
• Typically one protein is assayed at a time
• Resolution and quantitative nature of data depend on complexity of initial oligonucleotide library, stringency of selection, and depth of DNA sequencing
Advantages:
• Complexity of initial oligonucleotide library can be quite high
Disadvantages:
• If not careful, one can over-select and obtain primarily high-affinity binding sequences and lose information on lower-affinity sequences
• Need relatively large amounts of active, purified protein

Yeast one-hybrid (Y1H):
What it provides:
• Interactions between a protein and a DNA fragment, which can be either a larger (~2 kb) genomic fragment or a short (6–20 bp, often in multiple copies) DNA sequence such as a putative CRE [49–51]
How it works:
• DNA fragments of interest ('DNA baits') are cloned upstream of two reporter genes (typically HIS3 and LacZ); the DNA bait::reporter constructs are integrated into the genome of a Y1H yeast strain at a mutant marker locus; ORFs encoding proteins of interest are fused to the Gal4 activation domain (AD) ('prey'); yeast carrying an integrated DNA bait::reporter construct are transformed with a plasmid encoding the prey, or mated with a yeast strain carrying the prey construct; positive bait-prey interactions are identified from colonies where reporter gene activity is turned on: His3 expression is determined by growth on media lacking histidine and containing the competitive His3 inhibitor 3-aminotriazole, and LacZ induction is measured by a colorimetric (blue/white) assay
Throughput:
• When collections of TFs are used in an array format, the assay is proteome-scale. The coverage for TF collections for C. elegans is ~90% (834 out of 937 predicted TFs) [52], for human it is ~70% (988 out of 1434) [26], for Drosophila it is ~78% (588 out of 755) [53] and for Arabidopsis it is ~40% (645 out of 1500) [24]
• Multiple DNA baits can be processed simultaneously
Advantages:
• Interactions are tested in a eukaryotic organism (albeit in a heterologous context) independent of native conditions or native TF expression levels; as a result the assay is less condition dependent
• Can be done at various levels of scale and throughput, from a single DNA fragment with one TF using standard molecular biology lab resources to multiple DNA fragments and large arrays of TFs with sophisticated robotic and computational tools
Disadvantages:
• DNA fragments are not tested in their native chromosomal context
• Not yet suitable for heterodimers
• Does not provide information on in vivo interactions with other proteins or molecules unless specifically tested
• TF collections are available for only some organisms and are not yet complete
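The over-selection caveat listed for SELEX in Box 4.1 can be illustrated with a toy calculation. The sketch below is not a model of any published protocol: it assumes an idealized pool of three sequence classes with made-up per-round retention probabilities (`toy_affinities`), treats amplification as a simple renormalization, and ignores mutagenic PCR and sampling noise. The function name `selex_rounds` is hypothetical.

```python
def selex_rounds(affinities, rounds):
    """Track pool composition over idealized SELEX rounds.

    affinities: dict mapping a sequence label to its (hypothetical)
    probability of being retained in one selection round.
    Returns a list of dicts with pool fractions, starting with the
    uniform initial library and followed by one entry per round.
    """
    n = len(affinities)
    pool = {seq: 1.0 / n for seq in affinities}
    history = [dict(pool)]
    for _ in range(rounds):
        # Selection: each class is retained in proportion to its affinity.
        bound = {seq: frac * affinities[seq] for seq, frac in pool.items()}
        # Amplification: restore the pool size without changing the ratios.
        total = sum(bound.values())
        pool = {seq: amount / total for seq, amount in bound.items()}
        history.append(dict(pool))
    return history


# Made-up retention probabilities, for illustration only.
toy_affinities = {"high": 0.9, "medium": 0.3, "low": 0.1}
history = selex_rounds(toy_affinities, rounds=6)
for round_number, pool in enumerate(history):
    summary = ", ".join(f"{seq}={frac:.6f}" for seq, frac in pool.items())
    print(f"round {round_number}: {summary}")
```

Under these assumptions the high-affinity class dominates the pool within a few rounds, while the lower-affinity classes shrink toward fractions that a finite sequencing depth would no longer sample, which is the information loss the box warns about.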
as either homotypic or heterotypic TF DNA-binding site
clusters within CRMs [54], whereas in E. coli and S. cerevisiae
transcriptional regulation is often effected through
single or pairs of TF DNA-binding sites [55,56]. Typical
CRMs are ~200–1000 bp in length and contain one or
more DNA-binding sites for one or more TFs that activate
or repress the expression of target gene(s) [57]. Identification
of tissue/cell-type-specific CRMs (either enhancers
or silencers) remains a significant challenge and has been
the focus of many computational and experimental studies
(see below).
Computational methods for CRM prediction are
currently based on the model that CRMs are essentially
independent clusters of TF-binding sites [58–60]. These methods
typically favor the identification of evolutionarily
conserved sequences [43,61–63]. Combinations of TF
DNA-binding site motifs, or higher-order requirements on
the arrangement (spacing, order, orientation) of the motifs
relative to each other, that are associated with particular
gene expression output patterns are often referred to as
'cis-regulatory codes'.
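
The cluster-of-sites model described above can be sketched as a simple window scan. This is a minimal illustration, not any of the cited prediction methods: it assumes exact consensus matching on the forward strand only, uses two illustrative motif strings (`toy_motifs`), and ignores evolutionary conservation and motif arrangement. The function names, window size and site threshold are all hypothetical choices.

```python
import re


def find_sites(sequence, motifs):
    """Return sorted (position, motif_name) pairs for all motif matches."""
    sites = []
    for name, pattern in motifs.items():
        # A lookahead lets overlapping occurrences be reported as well.
        for match in re.finditer(f"(?=(?:{pattern}))", sequence):
            sites.append((match.start(), name))
    return sorted(sites)


def find_clusters(sites, window=200, min_sites=3):
    """Return merged (start, end, n_sites) regions dense in binding sites."""
    clusters = []
    for i, (start, _) in enumerate(sites):
        # Sites whose start lies within `window` bp of the current site.
        members = [pos for pos, _ in sites[i:] if pos - start < window]
        if len(members) < min_sites:
            continue
        end = members[-1]
        if clusters and start <= clusters[-1][1]:
            # Overlaps the previous cluster: extend it instead of duplicating.
            prev_start = clusters[-1][0]
            end = max(end, clusters[-1][1])
            n = sum(1 for pos, _ in sites if prev_start <= pos <= end)
            clusters[-1] = (prev_start, end, n)
        else:
            clusters.append((start, end, len(members)))
    return clusters


# Synthetic sequence: three sites close together, one isolated site.
toy_motifs = {"GATA": "GATAAG", "Ebox": "CACGTG"}
toy_sequence = (
    "A" * 300 + "GATAAG" + "A" * 20 + "CACGTG" + "A" * 20 + "GATAAG"
    + "A" * 300 + "CACGTG" + "A" * 300
)
toy_sites = find_sites(toy_sequence, toy_motifs)
clusters = find_clusters(toy_sites)
print(clusters)
```

In this toy input only the three closely spaced sites form a candidate region; the isolated site is not reported. A scan of this kind treats any sufficiently dense group of sites as a candidate, which is why higher-order constraints on spacing, order and orientation are usually layered on top.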
One experimental strategy to identify enhancers has
been to test conserved non-coding sequences for reporter
activity in tissues of interest. For example, an enhancer trap
of 1 Mb surrounding the gene Sonic Hedgehog (Shh),
followed by testing of non-coding elements conserved
between mouse and human, resulted in the identification of
functional
regulatory units composed of