pathways, and exploits computational and theoretical design tools to maximize the
efficiency of the designed pathways.
Several databases of enzymes and enzymatic reactions have served as platforms for the
computational design of novel pathways. These include, but are not limited to, the
Kyoto Encyclopedia of Genes and Genomes (KEGG); 11 the BRaunschweig ENzyme DAtabase
(BRENDA); 12 MetaCyc; 13 the University of Minnesota Biocatalysis/Biodegradation
Database (UM-BBD); 14,15 the Retro-Biosynthesis Tool (ReBiT); 16 and the Universal
Protein Resource (UniProt). 17 With these databases, it is possible to identify a
staggering number of different pathways for synthesizing the same target compound. Thus, a
major challenge in computational design is ranking candidate pathways to predict which
will give the highest yield of the desired compound.
The Biochemical Network Integrated Computational Explorer (BNICE) designs pathways
via matrix operations and evaluates candidate pathways by their thermodynamic
properties. 12,17,18 The KEGG LIGAND database serves as the enzyme library, and the enzymes
used in pathway design are selected by the first three levels of their EC numbers. Because
selection at this level is specific to functional groups rather than to substrates, the range
of enzymes considered is broad. The Gibbs free energy of each individual reaction
is then calculated, and unfavorable pathways are eliminated. By employing the BNICE framework,
over 400 000 theoretical novel biochemical pathways from the substrate chorismate to
an aromatic amino acid were discovered. 18 Thousands of novel linear polyketide structures
have also been explored using similar approaches. 19
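The two selection ideas described above, generalizing enzymes to the first three levels of their EC numbers and discarding thermodynamically unfavorable pathways, can be illustrated with a minimal Python sketch. This is not the BNICE implementation: the names `Step`, `ec_class` and `is_feasible`, the per-step threshold of 0 kJ/mol, and the EC strings and free-energy values are all placeholders assumed here for illustration, and the exact elimination criterion used by BNICE is not stated in the text.

```python
from dataclasses import dataclass


def ec_class(ec_number: str, levels: int = 3) -> str:
    """Truncate an EC number to its first `levels` fields, e.g. '4.2.3.5' -> '4.2.3'."""
    return ".".join(ec_number.split(".")[:levels])


@dataclass
class Step:
    ec_number: str   # full four-level EC number of the enzyme catalysing this step
    delta_g: float   # assumed estimate of the Gibbs free energy change, kJ/mol


def is_feasible(pathway, threshold=0.0):
    """Assumed criterion: every step must have delta_g at or below `threshold`."""
    return all(step.delta_g <= threshold for step in pathway)


# Placeholder candidates: the EC strings and delta_g values are made up for
# illustration and are not measured or published data.
candidates = {
    "route_a": [Step("4.2.3.5", -12.0), Step("5.4.99.5", -3.5)],
    "route_b": [Step("4.2.3.5", -12.0), Step("1.3.1.12", 25.0)],
}

# Drop pathways that contain an unfavorable step.
feasible = {name: steps for name, steps in candidates.items() if is_feasible(steps)}

# Describe each surviving pathway by three-level EC classes instead of specific enzymes.
generalized = {
    name: [ec_class(step.ec_number) for step in steps]
    for name, steps in feasible.items()
}
```

In this sketch `route_b` is discarded because one of its steps has a positive assumed free-energy change, and the surviving route is reported as a sequence of three-level EC classes, which is what makes the enzyme selection functional-group specific instead of substrate specific.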
ReBiT is a database of enzymes and enzymatic reactions, but it is also
used as a tool for pathway design. Like BNICE, ReBiT assigns enzymes by functional
group rather than by specific substrate. The user inputs a target compound, and the
program identifies, based on EC number, the functional-group classes of enzymes that could
theoretically produce that compound. Suggestions of specific enzymes that could be
involved, along with the related cofactors, are also displayed at the selection step of each reaction. 16
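As a rough illustration of this kind of lookup, the sketch below maps functional groups present in a target compound to EC classes that could in principle form them. It is only a toy under stated assumptions: the table `GROUP_TO_EC_CLASSES` and the function `suggest_enzyme_classes` are hypothetical, the table entries are a small hand-picked illustration, and none of it reflects ReBiT's actual data or interface.

```python
# Hypothetical lookup table (not ReBiT data): functional group in the target
# compound -> three-level EC classes whose reactions could form that group.
GROUP_TO_EC_CLASSES = {
    "hydroxyl": ["1.1.1", "3.1.1"],
    "primary_amine": ["2.6.1"],
    "carboxyl": ["1.2.1"],
}


def suggest_enzyme_classes(target_groups):
    """Return, for each functional group of the target, the candidate EC classes.

    Groups missing from the table map to an empty list, signalling that no
    enzyme class is known to this toy table.
    """
    return {group: GROUP_TO_EC_CLASSES.get(group, []) for group in target_groups}


# Usage: a target compound described (by assumption) as a list of group labels.
suggestions = suggest_enzyme_classes(["hydroxyl", "primary_amine"])
```

A real tool would derive the functional groups from the input structure and would go on to list specific enzymes and cofactors within each class; both steps are omitted here.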
The UM-BBD Pathway Prediction System (UM-PPS) applies natural precedents to eliminate
infeasible steps when predicting plausible biodegradation pathways. 15 This prediction system
uses the UM-BBD as its database. 21 The user inputs the chemical structure
of a substrate, and its functional groups are analyzed. Atom-to-atom mapping is then
applied until all possible resulting compounds are displayed. Unlike ReBiT,
UM-PPS has users select compound intermediates instead of enzymes, and the operation
repeats iteratively until a complete pathway ends at the desired product. A series of other
restrictions and rules for pathway design have been implemented as well. One restriction
ranks UM-PPS results into five aerobic likelihood groups, from very likely to very unlikely,
thus reducing the number of predictions. 20 Another rule focuses on specific functional
groups in a molecule that almost invariably have the potential for the desired reaction,
and from these a set of priority rules is extracted. 21 These restrictions reduce the number of
predicted products on the basis of known pathways and avoid false-positive compounds that do
not exist in nature. The UM-PPS tool has been used to identify novel routes for the
biodegradation of multiple substances, and some of the predicted pathways matched
those predicted by human experts. 14
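The iterative, likelihood-filtered prediction loop described above can be sketched in Python as follows. This is a simplified illustration, not the UM-PPS code: compounds are reduced to a single functional-group label, the `RULES` table is hypothetical, the names of the three intermediate likelihood groups are an assumption (the text names only the two extremes), and the `choose` callback stands in for the user's manual selection of an intermediate.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    """Five aerobic likelihood groups; the middle three names are assumed."""
    VERY_LIKELY = 1
    LIKELY = 2
    NEUTRAL = 3
    UNLIKELY = 4
    VERY_UNLIKELY = 5


# Hypothetical rules: (rule id, group consumed, group produced, likelihood).
RULES = [
    ("r1", "ester", "carboxyl", Likelihood.VERY_LIKELY),
    ("r2", "primary_alcohol", "aldehyde", Likelihood.LIKELY),
    ("r3", "aldehyde", "carboxyl", Likelihood.LIKELY),
    ("r4", "primary_alcohol", "alkene", Likelihood.VERY_UNLIKELY),
]


def predict_next(group, cutoff=Likelihood.NEUTRAL):
    """Return (rule id, product) pairs for matching rules at or above the cutoff,
    most likely first."""
    hits = [rule for rule in RULES if rule[1] == group and rule[3] <= cutoff]
    hits.sort(key=lambda rule: rule[3])
    return [(rule[0], rule[2]) for rule in hits]


def walk(start, target, choose=lambda options: options[0], max_steps=10):
    """Repeat predict-then-select until the target is reached.

    Returns the list of intermediates, or None if no rule applies or the
    step limit is exhausted before reaching the target.
    """
    path, current = [start], start
    for _ in range(max_steps):
        if current == target:
            return path
        options = predict_next(current)
        if not options:
            return None
        _, current = choose(options)
        path.append(current)
    return path if current == target else None


route = walk("primary_alcohol", "carboxyl")
```

Tightening or loosening `cutoff` shows how the likelihood groups trade the number of predictions against coverage: with the default cutoff the very unlikely rule `r4` is never offered, whereas `predict_next("primary_alcohol", cutoff=Likelihood.VERY_UNLIKELY)` returns both products.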
A framework recently developed by Cho et al. employs a retrosynthesis model to generate
pathways and evaluates them with a prioritization scoring algorithm. 22 In this framework,
two databases are established: a reaction rule database and a binding
site rule database. Fifty reaction rules were developed, and binding site rules were generated
based on functional groups. These databases reportedly cover 81% of the enzymes in the
KEGG database, enabling a near-comprehensive analysis.
Five factors including binding site covalence, chemical similarity, thermodynamic