during a substructure search. For example, the following SQL might be
used to locate structures containing the phenol group.
Select cansmi From atable
Where (fkey & key('c1ccccc1O')) = key('c1ccccc1O')
And matches(smiles, 'c1ccccc1O');
In this example, the function named key returns a bit string denoting the
presence or absence of each of the N fragments. The column fkey contains
the bit string for each structure, precomputed using key(cansmi).
The SQL clause where (fkey & key('c1ccccc1O')) = key('c1ccccc1O') will be
true only when all bits set to 1 in key('c1ccccc1O') are also set in fkey.
In other words, that clause will be true only when the structure (the row
with that fkey) contains all the fragments that phenol contains. That is
necessary but not sufficient for the cansmi of that structure to match phenol.
The final matches function must be used to return the proper set of
substructure matches. However, since the comparison of bit strings using the
& operator is much quicker than the matches function, the bit string
comparison acts as a quick filter. The more time-consuming matches function
is evaluated only for those structures that pass the quicker bit string test.
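The screen-then-verify pattern described above can be sketched outside the database. The following Python is a minimal illustration, not the book's implementation: bit strings are modeled as plain integers, the three-bit key layout is invented for the example, and slow_match is a hypothetical stand-in for the matches function.

```python
# Hypothetical 3-bit key layout, invented for this illustration only:
# bit 0 = aromatic ring, bit 1 = hydroxyl, bit 2 = carbonyl.
RING, HYDROXYL, CARBONYL = 0b001, 0b010, 0b100

# (name, precomputed fkey) pairs standing in for rows of atable.
rows = [
    ("benzene", RING),
    ("ethanol", HYDROXYL),
    ("phenol", RING | HYDROXYL),
    ("benzyl alcohol", RING | HYDROXYL),
    ("acetone", CARBONYL),
]


def passes_screen(fkey: int, qkey: int) -> bool:
    """True when every bit set in the query key is also set in fkey."""
    return (fkey & qkey) == qkey


def search(rows, qkey, slow_match):
    """Run the cheap bit screen first; call slow_match only on survivors."""
    hits, slow_calls = [], 0
    for name, fkey in rows:
        if not passes_screen(fkey, qkey):
            continue  # rejected by the quick filter, no expensive call made
        slow_calls += 1
        if slow_match(name):
            hits.append(name)
    return hits, slow_calls


def slow_match(name: str) -> bool:
    """Stand-in for the real substructure matcher (assumption: only
    phenol truly contains the query pattern)."""
    return name == "phenol"


hits, slow_calls = search(rows, RING | HYDROXYL, slow_match)
print(hits, slow_calls)
```

With these coarse keys, benzyl alcohol passes the screen but fails the full match, which illustrates why the bit test is necessary but not sufficient. Only two of the five rows reach the expensive matcher.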
The computation of the fkey using the key function is time-consuming
for a large table of structures, but it need be done only once and stored in
a row with the corresponding SMILES.
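As a rough sketch of computing the key once and storing it beside each structure, the Python below uses an in-memory SQLite table. Everything here is an assumption made for illustration: toy_key is a crude text test rather than real fragment matching, the two-bit layout is invented, and SQLite stands in for whatever database holds atable. The screen tests that every bit of the query key is present in the stored key.

```python
import sqlite3


def toy_key(smiles: str) -> int:
    """Hypothetical stand-in for key(): a crude text test, not SMARTS matching.
    Bit 0 = contains a benzene ring string, bit 1 = contains an oxygen."""
    key = 0
    if "c1ccccc1" in smiles:
        key |= 1
    if "O" in smiles:
        key |= 2
    return key


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE atable (cansmi TEXT, fkey INTEGER)")

# The key is computed once per structure, at load time, and stored in the row.
structures = ["c1ccccc1", "CCO", "c1ccccc1O", "CC(=O)C"]
conn.executemany(
    "INSERT INTO atable (cansmi, fkey) VALUES (?, ?)",
    [(smi, toy_key(smi)) for smi in structures],
)

# At query time only the query key is computed; stored keys are reused.
qkey = toy_key("c1ccccc1O")
candidates = [
    row[0]
    for row in conn.execute(
        "SELECT cansmi FROM atable WHERE (fkey & ?) = ?", (qkey, qkey)
    )
]
print(candidates)
```

The candidates returned by the stored-key screen would still need to be confirmed by a full substructure match.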
The key function used above can be written using SQL, along with a
table of fragments. As a simple example, the fragments shown in Table 8.1
are used. This table could be created using the following SQL.
Create Table fragments (description Text, smarts Text, abit Integer);
The column named smarts contains the SMiles ARbitrary Target
Specification (SMARTS) pattern defining the fragment. The column named
Table 8.1 Simple Fragment Keys Defined Using SMARTS

Description         Smarts            abit
Phenyl              c1ccccc1          1
Aliphatic alcohol   C[OH]             2
Alcohol             [C,c][OH]         3
Aromatic alcohol    c[OH]             4
Aliphatic ether     COC               5
Ether               [C,c]O[C,c]       6
Aromatic ether      cOc               7
Ketone              O=[CH0](C)C       8
Carboxylic acid     O=[CH0](C)[OH]    9
Aldehyde            O=[CH1]C          10
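To make the role of the fragment table concrete, here is a minimal Python sketch of how a key could be assembled from the rows of Table 8.1. It is not the SQL implementation the text refers to. Two things are assumptions: that abit value n corresponds to bit position n-1 of an integer, and the matcher itself, which is passed in as a parameter. The demo_matches function is a hypothetical lookup that knows only phenol, standing in for a real SMARTS matcher.

```python
# Rows of Table 8.1 as (description, smarts, abit).
FRAGMENTS = [
    ("Phenyl", "c1ccccc1", 1),
    ("Aliphatic alcohol", "C[OH]", 2),
    ("Alcohol", "[C,c][OH]", 3),
    ("Aromatic alcohol", "c[OH]", 4),
    ("Aliphatic ether", "COC", 5),
    ("Ether", "[C,c]O[C,c]", 6),
    ("Aromatic ether", "cOc", 7),
    ("Ketone", "O=[CH0](C)C", 8),
    ("Carboxylic acid", "O=[CH0](C)[OH]", 9),
    ("Aldehyde", "O=[CH1]C", 10),
]


def make_key(smiles, fragments, matches):
    """Set one bit per fragment that the structure contains.

    Assumption: abit n maps to integer bit n-1. `matches(smiles, smarts)`
    is supplied by the caller, for example a cheminformatics toolkit.
    """
    key = 0
    for _description, smarts, abit in fragments:
        if matches(smiles, smarts):
            key |= 1 << (abit - 1)
    return key


# Hypothetical stand-in matcher: a hard-coded answer for phenol only.
PHENOL_FRAGMENTS = {"c1ccccc1", "[C,c][OH]", "c[OH]"}


def demo_matches(smiles, smarts):
    if smiles != "c1ccccc1O":
        raise ValueError("demo matcher only knows phenol")
    return smarts in PHENOL_FRAGMENTS


phenol_key = make_key("c1ccccc1O", FRAGMENTS, demo_matches)
print(format(phenol_key, "010b"))  # bits 1, 3 and 4 set: 0000001101
```

Under this numbering, phenol sets the Phenyl, Alcohol, and Aromatic alcohol bits and no others.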