be generated from SMILES using the functions smiles_to_symbols
and smiles_to_bonds. These functions are shown in the Appendix
using both the FROWNS and OpenBabel toolkits. The following SQL produces
a connection table in the form of two arrays: one of atom symbols, and one
of bond indices and orders.
select smiles,smiles_to_symbols(smiles), smiles_to_bonds(smiles)
from nci.structure where cas = '1467-70-5';
smiles | smiles_to_symbols | smiles_to_bonds
----------------------+------------------------+----------------------------
c1cc(oc1)C(=O)C(=O)O | {C,C,C,O,C,C,O,C,O,O} | {{1,2,4},{2,3,4},{3,4,4},…}
The smiles_to_symbols and smiles_to_bonds functions return
arrays of values. In the sample output above, the smiles_to_bonds
output has been truncated for easier viewing. Some client programs may
expect this information as separate rows, as if they were records in a file.
These arrays may be converted into that form by using a plpgsql function that
returns the elements of an array as rows. This is shown in the next section.
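For comparison, the same reshaping can also be done on the client side. The following is a minimal Python sketch, not the plpgsql function mentioned above. It assumes the client has already received the two arrays as Python lists, and that each bond triple holds two 1-based atom indices followed by a bond order; that reading is an interpretation of the sample output, not something the text states. The function names are hypothetical.

```python
def symbols_to_rows(symbols):
    """Return one (atom_index, symbol) row per array element, numbered from 1."""
    return [(i, s) for i, s in enumerate(symbols, start=1)]


def bonds_to_rows(bonds):
    """Return one row per bond triple.

    Assumption: each triple is (atom1, atom2, bond_order).
    """
    return [tuple(b) for b in bonds]


# Values copied from the sample output; only the three bond triples
# that were not truncated are included.
symbols = ["C", "C", "C", "O", "C", "C", "O", "C", "O", "O"]
bonds = [[1, 2, 4], [2, 3, 4], [3, 4, 4]]

atom_rows = symbols_to_rows(symbols)
bond_rows = bonds_to_rows(bonds)

for row in atom_rows:
    print(row)
for row in bond_rows:
    print(row)
```

Each printed tuple corresponds to one record, which is the shape a file-oriented client program would expect.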
11.5 Using Tables Instead of Files in Client Programs
If an RDBMS is used to store molecular structures, existing computer
programs that read molecular structure files must be modified. The
modifications are confined to the portions of the programs that read and
write files. These portions become functions
that use SQL to access an RDBMS. Chapter 5 introduced methods for client
programs to access data stored in RDBMS tables using SQL. This section
shows how an existing program that reads and writes molfiles can be readily
modified to use an RDBMS. First, however, it is necessary to describe
the schema and tables used to store molecular structures. It is important
that these tables can accommodate not only information from molfiles, but
also information from other molecular file formats in common use.
Consider the vla4 schema described above. It might be possible for the
client program to read the molfile data directly from the vla4.sdf file, but
the goal is to use the data in the vla4.structure and vla4.property
tables. Recall that these tables, or ones like them in another schema, could
have been created from files other than molfiles. These tables could also
have been populated with other client programs that no longer use files at
all, but instead store molecular structure data in RDBMS tables.
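The change described above can be sketched in Python. This is an illustration under stated assumptions, not the program from the text. It uses the standard-library sqlite3 module with an in-memory database as a stand-in for the RDBMS, and the column names (id, name, molfile) of vla4.structure are hypothetical, as are all function names. Only the read function changes; the computation that consumes the records stays the same.

```python
import sqlite3


def read_structures_from_file(path):
    """Traditional reader: return the text of each record in an SD file."""
    records, current = [], []
    with open(path) as f:
        for line in f:
            if line.startswith("$$$$"):  # SD record terminator
                records.append("".join(current))
                current = []
            else:
                current.append(line)
    return records


def read_structures_from_db(conn):
    """Replacement reader: fetch the same records with SQL."""
    cur = conn.execute("SELECT molfile FROM vla4.structure ORDER BY id")
    return [row[0] for row in cur]


def count_atoms(molfile):
    """Example computation: atom count from the V2000 counts line (line 4)."""
    return int(molfile.splitlines()[3][:3])


# A one-atom V2000 molfile used as sample data.
WATER = "\n".join([
    "water",
    "  sketch",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
])

# Stand-in database: attach an in-memory database under the name vla4
# so that the qualified table name vla4.structure resolves.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS vla4")
conn.execute(
    "CREATE TABLE vla4.structure "
    "(id INTEGER PRIMARY KEY, name TEXT, molfile TEXT)"
)
conn.execute(
    "INSERT INTO vla4.structure (name, molfile) VALUES (?, ?)",
    ("water", WATER),
)

# The computation is unchanged; only the source of the records differs.
atom_counts = [count_atoms(m) for m in read_structures_from_db(conn)]
print(atom_counts)
```

Because count_atoms takes the molfile text as its argument, it does not need to know whether the record came from a file or from a table.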
A traditional client program reads from a molecular structure file and
performs some computation that depends on the molecular structural data.
Its read(file) function reads particular columns or fields from the file. A
different function would be necessary for each type of file format. A traditional
client program can be modified to read molecular structure data from