Drug Discovery and Development via Synthetic Biology - Synthetic Biology

Currently, state-of-the-art sequence database search tools employ more sophisticated

algorithms to reduce computation time and increase sensitivity to weak similarities.

For example, tools such as SAM 22 and HMMER 23 employ hidden Markov models to identify

sequence homology. Perhaps the most widely used search tool is the Basic Local Alignment

Search Tool, or BLAST, first presented by Altschul and coworkers in 1990. 24 At its inception,

BLAST offered high-sensitivity database searching at speeds much faster than any previous

algorithm, and proved amenable to mathematical and statistical analysis. Subsequent

versions, such as gapped BLAST and position-specific iterative (PSI) BLAST, have further

improved computation time and sensitivity to weak, but still biologically relevant,

similarity. 25 Functional prediction via BLAST is further enhanced through coupling with the

Conserved Domain Database (CDD), 26 which integrates data from sources such as Pfam 27

and SMART 28 to identify regions of the query sequence with evolutionarily conserved

functions, such as binding a metal ion or cofactor. BLAST results can also be coupled to

tools such as GCView, 29 which enable analysis of the genomic context of search results

to facilitate more accurate functional prediction.
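
To make the speed argument concrete, the following is a toy sketch of the seed-and-extend idea that heuristic search tools such as BLAST build on: index short words in the subject sequence, look up exact word hits from the query, and extend each hit without gaps until the score drops off. It is an illustration under simplifying assumptions, not BLAST itself. The function names, the match/mismatch scores, the word size `k`, and the `drop` threshold are all hypothetical choices made for this example; real tools use substitution matrices, neighbourhood words, gapped extension, and statistical significance estimates.

```python
from collections import defaultdict


def build_word_index(subject, k):
    """Map every k-letter word in `subject` to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(subject) - k + 1):
        index[subject[i:i + k]].append(i)
    return index


def extend_seed(query, subject, q_pos, s_pos, k, match=1, mismatch=-1, drop=2):
    """Extend an exact k-letter seed without gaps in both directions.

    Extension in one direction stops once the running score has fallen
    `drop` or more below the best score seen in that direction.
    Returns (score, query_start, subject_start, length).
    """
    def walk(qi, si, step):
        best = score = 0
        best_len = n = 0
        while 0 <= qi < len(query) and 0 <= si < len(subject):
            score += match if query[qi] == subject[si] else mismatch
            n += 1
            if score > best:
                best, best_len = score, n
            elif best - score >= drop:
                break
            qi += step
            si += step
        return best, best_len

    right_score, right_len = walk(q_pos + k, s_pos + k, 1)
    left_score, left_len = walk(q_pos - 1, s_pos - 1, -1)
    score = k * match + left_score + right_score
    return score, q_pos - left_len, s_pos - left_len, k + left_len + right_len


def best_local_hit(query, subject, k=3):
    """Return the highest-scoring ungapped hit, or None if no seed matches."""
    index = build_word_index(subject, k)
    best = None
    for q_pos in range(len(query) - k + 1):
        for s_pos in index.get(query[q_pos:q_pos + k], []):
            hit = extend_seed(query, subject, q_pos, s_pos, k)
            if best is None or hit[0] > best[0]:
                best = hit
    return best


if __name__ == "__main__":
    # Toy sequences chosen only for illustration.
    print(best_local_hit("GATTACA", "CCGATTACACC"))
```

Because only positions that share an exact word with the query are ever extended, most of the subject sequence is never scored at all, which is the intuition behind the speed advantage over exhaustive alignment.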

In recent years, predictive algorithms have expanded beyond individual coding sequence

queries to a variety of other targets. For example, IsoRankN enables the alignment of entire

protein-protein interaction networks for the prediction of functional orthologues across

species. 30 Tools such as PromPredict, 31 ConTra, 32 and RSAT, 33 among others, focus not on

protein sequences, but on the sequences of regulatory regions such as promoters and

transcription factor binding sites. Such tools have clear applications in synthetic biology,

not only for the design of biosynthetic pathways, but also for synthetic gene circuits

and signal transduction systems.

PATHWAY DISCOVERY, PREDICTION, AND ANALYSIS

The computational tools described above are generally applicable to any DNA or protein

sequence, and as a result have proven very useful for a wide range of applications. For the

synthesis of drugs and drug candidates, however, special attention is paid to those proteins

that are involved in secondary metabolism. This is because secondary

metabolite natural products and their derivatives and analogues represent a substantial

fraction of the drugs available today. For example, in 2007 it was reported that 72.9% of

anticancer drugs and 68.9% of small molecule antiinfectives are natural products or derived

therefrom. 1 As a result, a number of tools have been developed for the discovery,

prediction, and analysis of secondary metabolite gene clusters ( Table 10.1 ).

Two classes of natural products that have garnered significant research interest are

polyketides and nonribosomal peptides, complex compounds that are synthesized

by multimodular, assembly line megasynthases known as polyketide synthases (PKSs)

and nonribosomal peptide synthetases (NRPSs), respectively. In the past 15 years, a number

of computational tools have been developed not only for the identification of PKS and

NRPS gene clusters from DNA sequence data, but also for the prediction of their

corresponding products. Some of the earliest efforts toward in silico prediction of NRPS

products focused on the specificity of adenylation domains. In 1997, de Crécy-Lagard and

coworkers examined 55 adenylation domain sequences to devise rules for specificity

prediction, but found that these rules yielded good predictions in only 43% of

cases. 34 Two years later, however, analysis of the crystal structure of the adenylation domain

PheA involved in gramicidin S biosynthesis enabled two groups to provide much more

accurate specificity predictions. Stachelhaus and coworkers identified 10 specificity-

conferring residues, allowing 86% accuracy in specificity prediction. 35 Challis and coworkers

took a very similar approach, identifying an eight-residue signature sequence. 36 More

recently, a sophisticated prediction algorithm based on transductive support vector

machines has been devised that also incorporates the physicochemical properties of the
