Chemistry Reference
In-Depth Information
Insert Into vla4.property (id, name, tvalue)
Select id, (molfile_properties).name, (molfile_properties).value From
(Select id, openbabel.molfile_properties(molfile) From vla4.sdf) atmp
This insert statement is a bit more complex than the one that inserted
rows into the vla4.structure table. In that table, only one row was
returned from the molfile _ mol function. The molfile _ properties
function returns multiple rows for each molfile, when there are multiple
properties for each molfile. The second select statement above (the one
in parentheses and identified with the name atmp ) selects all the rows for
each molfile. The first select statement selects all the columns from each
returned row from molfile _ properties . These are then inserted into
the vla4.property table.
Finally, the nvalue column of the vla4.property table can be populated
when possible. This column stores the numerical value of the property.
Since not all values are numerical, this column may have null entries. The
purpose is to enable efficient use of numerical data when appropriate,
for example, to select by value, sort, apply mathematical functions, etc.
The following SQL will update the nvalue column when possible with
a numeric value. The tilde operator in the where selects text values that
match the regular expression. The expression shown here allows integers,
decimal values, and scientific notation using E or e for the exponent, for
example 6.023E23.
Update vla4.property Set nvalue = tvalue::numeric
Where tvalue ~ E'^[+-]?[0-9]+(\\\\.[0-9]*)?([Ee][+-]?[0-9]+)?\$';
It is possible to create additional columns in the property table, for exam-
ple to contain an integer representation of the data value if that is consid-
ered necessary.
Once the data from the molfiles are extracted and loaded into the
val4.sdf table, the table could be deleted. The vla4.sdf table might
also be retained as a backup of the original sdf file, easily accessible from
within the database for further processing. In either case, the vla4.sdf
table will not create any overhead while using the vla4.structure or
vla4.property tables. It will merely use space in the vla4 schema, in
the same way that a backup copy of the original sdf file would use space
in a folder on the computer's disk.
It may seem that some information will have been lost if the original
molfile is discarded. For example, the list of atomic symbols and bonds is
not stored directly in the vla4.structure table. The cansmiles string,
however, does contain this information. It may be necessary for some pur-
poses to extract this information, for example, when an external program
does not read SMILES but instead requires a list of atomic symbols and
bonds. This is discussed more in the next section. A connection table can
Search WWH ::




Custom Search