discussed in Chapter 13. There are two Perl scripts shown here: one to load
a SMILES file and one to load an SDF file. Each creates a new schema named
by the user and, within that schema, a table named structure containing
SMILES, isomeric SMILES, canonical SMILES, names, and fingerprints. The
sdfloader script creates two additional tables named sdf and property.
The sdf table contains the minimally processed input file, simply split
into separate structures, one per row. The property table contains the
names and values of the data items for each structure. The unprocessed
text value of each data item is stored along with its numeric value, if
the text can be converted to a number.
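The text-plus-number rule for the property table can be illustrated with a short sketch. The conversion used by sdfloader is not shown in this section, so the helper name text_and_number and the use of looks_like_number from the core Scalar::Util module are assumptions made only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper (not from the original scripts): return the
# unprocessed text value and, when the text parses as a number, its
# numeric value. A non-numeric value yields undef, which would be
# stored as NULL in the database.
# Note: looks_like_number also accepts strings such as "Inf" and "NaN",
# which a real loader might want to reject.
sub text_and_number {
    my ($text) = @_;
    (my $trimmed = $text) =~ s/^\s+|\s+$//g;   # ignore surrounding spaces
    my $number = looks_like_number($trimmed) ? $trimmed + 0 : undef;
    return ($text, $number);
}

# Sample data item values, chosen only for illustration.
for my $value ('180.16', '1.5e-3', '>1000', 'aspirin') {
    my ($text, $number) = text_and_number($value);
    printf "%-8s -> %s\n", $text, defined $number ? $number : 'NULL';
}
```

Keeping the original text beside the numeric value means that qualified entries such as ">1000" are not lost, even though they cannot be stored as numbers.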
The previous section shows a number of utility functions that operate
from the Linux command line. Those utilities were intended to be used with
the tables created by the smiloader and sdfloader scripts shown here. It
is also possible to use the data in the tables created by these scripts
to create other tables and schemas that are better suited to the needs of
a particular project. Any of the other functions described in this topic
can also be used with these tables.
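As a sketch of deriving a project-specific table from a loaded schema, the following script prints SQL in the same style as the loaders. The schema names vendor and myproject, the table name compound, and the helper project_sql are all hypothetical; the column names are those of the structure table created by smiloader.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical names: $source is a schema created by one of the loaders,
# $target is a new project schema. Neither appears in the original text.
sub project_sql {
    my ($source, $target) = @_;
    return <<EOSQL;
Create Schema $target;
Create Table $target.compound As
  Select id, name, isosmiles As smiles
  From $source.structure
  Where isosmiles Is Not Null;
EOSQL
}

# Print the SQL so it can be sent to a PostgreSQL client.
print project_sql('vendor', 'myproject');
```

The Where clause keeps only rows for which the loader was able to compute an isomeric SMILES, on the assumption that rows with invalid SMILES are not wanted in the project table.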
A.9.1 Smiloader
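#!/usr/bin/perl
# Read a SMILES file on standard input and print SQL that loads it into a
# new schema. The output is intended for a PostgreSQL client such as psql.
use strict;
use warnings;

my $schema = $ARGV[0];
die "Schema name required\nusage: loader schema\n" unless $schema;

# Emit the schema, the structure table, and the start of the COPY block.
print <<EOSQL;
Drop Schema If Exists $schema Cascade;
Create Schema $schema;
Create Sequence $schema.structure_id_seq;
Create Table $schema.structure (
  id Integer Primary Key Default Nextval('$schema.structure_id_seq'),
  name Text, smiles Text, isosmiles Text, fp Bit Varying);
Copy $schema.structure (smiles,name) From Stdin;
EOSQL

# Emit one tab-separated row per input line: SMILES, then name.
while (<STDIN>) {
    s/\r//; chomp;                        # strip CRLF or LF line endings
    my ($smi, $name) = split;             # first two whitespace-separated fields
    next unless defined $smi;             # skip blank lines
    $name = '' unless defined $name;      # allow a SMILES with no name
    print "$smi\t$name\n";
}

# End the COPY data, then compute fingerprints and isomeric SMILES for
# rows whose SMILES is valid, using functions in the openbabel schema.
print <<EOSQL;
\\.
Set search_path=openbabel;
Update $schema.structure Set fp=fp(smiles),
  isosmiles=isosmiles(smiles) Where valid(smiles);
EOSQL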
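The loop above takes the first two whitespace-separated fields of each line, so a name that contains spaces is cut at its first word. If full names are needed, one possible variant is sketched below. The helper parse_smiles_line is hypothetical and not part of the original script; it assumes that everything after the first field is the name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: parse one line of a SMILES file into
# (smiles, name). Returns an empty list for a blank line.
sub parse_smiles_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                     # strip LF or CRLF line ending
    return unless $line =~ /\S/;              # nothing to parse on a blank line
    my ($smi, $name) = split ' ', $line, 2;   # limit of 2 keeps spaces in the name
    $name = '' unless defined $name;          # a SMILES with no name
    $name =~ s/\s+\z//;                       # drop trailing whitespace
    $name =~ s/\t/ /g;                        # a tab would start a new COPY column
    return ($smi, $name);
}

# Sample lines, chosen only for illustration.
for my $line ("CCO ethanol\r\n", "\n", "CC(=O)O acetic acid\n") {
    my @fields = parse_smiles_line($line) or next;
    print join("\t", @fields), "\n";
}
```

Tabs inside the name are replaced with spaces because the rows are written in tab-separated form for Copy.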