substance
    title                   TEXT
    bondannotations         TEXT
    cid_associations        TEXT
    compound_id_type        INTEGER
    ext_datasource_name     TEXT
    ext_datasource_url      TEXT
    ext_substance_url       TEXT
    genbank_nucleotide_id   TEXT
    genbank_protein_id      TEXT
    generic_registry_name   TEXT
    pubmed_id               TEXT
 PK substance_id*           INTEGER
    substance_comment       TEXT
    substance_synonym       TEXT
    substance_version       TEXT
    total_charge            INTEGER
    xref_ext_id             TEXT

nci_h23
 FK sid*                    INTEGER
    ext_datasource_regid    INTEGER
    cid                     INTEGER
    activity_outcome        INTEGER
    activity_score          INTEGER
    activity_url            TEXT
    assaydata_comment       TEXT
    assaydata_revoke        TEXT
    log_gi50_M              NUMERIC
    log_gi50_ugml           NUMERIC
    log_gi50_v              NUMERIC
    indngi50                INTEGER
    stddevgi50              NUMERIC
    logtgi_m                NUMERIC
    logtgi_ugml             NUMERIC
    indntgi                 INTEGER
    stddevtgi               NUMERIC

Figure 6.3 Entity-relationship diagram for the pubchem.substance and
pubchem.nci_h23 tables.
Figure 6.3 shows the relationship between the pubchem.nci_h23
and pubchem.substance tables in the form of an entity-relationship
diagram (ERD). The primary key substance.substance_id and the
foreign key nci_h23.sid are marked in the diagram; these are the
columns to use in the On clause when the two tables are joined.
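The join implied by Figure 6.3 can be sketched as follows. This is a minimal, self-contained illustration that assumes Python's built-in sqlite3 module in place of the database server used in this chapter. The attached in-memory database named pubchem stands in for the pubchem schema, only a few of the columns from the figure are created, and the rows are invented placeholder values, not PubChem data.

```python
import sqlite3

# Assumption: SQLite stands in for the chapter's database server. Attaching an
# in-memory database under the name "pubchem" keeps the pubchem.<table> names.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS pubchem")

# Only a few of the columns shown in Figure 6.3 are reproduced here.
conn.execute("""
    CREATE TABLE pubchem.substance (
        substance_id      INTEGER PRIMARY KEY,
        substance_synonym TEXT
    )""")
conn.execute("""
    CREATE TABLE pubchem.nci_h23 (
        sid              INTEGER REFERENCES substance (substance_id),
        activity_outcome INTEGER,
        log_gi50_M       NUMERIC
    )""")

# Placeholder rows, invented purely for illustration.
conn.executemany(
    "INSERT INTO pubchem.substance VALUES (?, ?)",
    [(1, "substance A"), (2, "substance B"), (3, "substance C")],
)
conn.executemany(
    "INSERT INTO pubchem.nci_h23 VALUES (?, ?, ?)",
    [(1, 2, -6.5), (3, 1, -4.0)],
)

# The On clause pairs the foreign key nci_h23.sid with the primary key
# substance.substance_id.
join_sql = """
    Select s.substance_id, s.substance_synonym, a.log_gi50_M
    From pubchem.substance s
    Join pubchem.nci_h23 a On a.sid = s.substance_id
    Order By s.substance_id
"""
joined_rows = conn.execute(join_sql).fetchall()
for row in joined_rows:
    print(row)
```

In this sketch, only substances that have a matching row in nci_h23 appear in the result, because a plain Join keeps only rows that satisfy the On condition.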
6.4.3 Compounds
The third set of files from the PubChem repository describes chemical
compounds. These are distributed as SDF files, and each compound is
identified by a unique compound id. Multiple properties are also
associated with each compound. The table pubchem.compound is created
using the sdf2sql file utility described above. The compound table can
then be used to locate compounds by searching any of its columns; for example,
Select * From pubchem.compound Where iupac_name Like '%aldehyde%'
And heavy_atom_count < 20;
would select small aldehydes. When used in conjunction with the
biological assay data and substance table, the compound table becomes
even more useful.
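The aldehyde search above can be tried on a toy table. The following is a minimal sketch that assumes Python's built-in sqlite3 module rather than the chapter's database server. The columns iupac_name and heavy_atom_count come from the query in the text; the cid column name, the Order By clause (added only to make the output order predictable), and the four rows are assumptions made for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for the chapter's database server. Attaching an
# in-memory database under the name "pubchem" keeps the pubchem.compound name.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS pubchem")

# A three-column stand-in for the much wider compound table.
# The column name cid is an assumption; the ids below are placeholders.
conn.execute("""
    CREATE TABLE pubchem.compound (
        cid              INTEGER PRIMARY KEY,
        iupac_name       TEXT,
        heavy_atom_count INTEGER
    )""")
conn.executemany(
    "INSERT INTO pubchem.compound VALUES (?, ?, ?)",
    [
        (1, "acetaldehyde", 3),
        (2, "benzaldehyde", 8),
        (3, "ethanol", 3),                         # small, but not an aldehyde
        (4, "4-(diphenylamino)benzaldehyde", 21),  # aldehyde, but too large
    ],
)

# Same search as in the text, with Order By added for a stable row order.
aldehyde_sql = """
    Select * From pubchem.compound
    Where iupac_name Like '%aldehyde%'
    And heavy_atom_count < 20
    Order By cid
"""
small_aldehydes = conn.execute(aldehyde_sql).fetchall()
for row in small_aldehydes:
    print(row)
```

Note that the Like pattern matches only names that contain the substring "aldehyde". An aldehyde whose systematic name simply ends in "-al" would not be found by this search.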
From the examples in the previous section, it is clear how the
substance id relates pubchem.substance to biological assay data and how
substance data can be selected using the substance id. How can the
compound table be used to select compound data for substances
appearing in one of the biological assay data tables? In other words, how is the