pubchem.nci_h23 schema. This table definition can be used, changing
just the table name and the comment on the table.
The structure of this table and the data in it can be viewed online using
the phpPgAdmin Web application. 4 While it is possible to simply browse
this table, it is more useful to search for data of interest. This applies to
data in any table, of course. Simple SQL statements to carry out searches
can be entered using the SQL link on the upper right. For example, the
following SQL statement finds rows where the substance is considered to
be active.
Select sid, activity_outcome, "log_gi50_M", log_gi50_ugml From
nci_h23 Where activity_outcome = 2;
This may be of use, but more likely some information about the actual
substances and structures is needed, not just the substance id.
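The search above can also be issued from a program rather than typed into phpPgAdmin. The following is a minimal sketch in Python, assuming an in-memory SQLite database stands in for the PostgreSQL server so that the example is self-contained. The column types and the three rows are made-up placeholders, not PubChem data, and the helper names find_active and demo_connection are hypothetical.

```python
import sqlite3

def find_active(conn, table="nci_h23"):
    """Return the rows whose activity_outcome is 2 (active), ordered by sid."""
    query = (
        'Select sid, activity_outcome, "log_gi50_M", log_gi50_ugml '
        f"From {table} Where activity_outcome = ? Order By sid"
    )
    return conn.execute(query, (2,)).fetchall()

def demo_connection():
    """Build a stand-in table; the types and rows are illustrative assumptions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "Create Table nci_h23 (sid integer, activity_outcome integer, "
        '"log_gi50_M" real, log_gi50_ugml real)'
    )
    conn.executemany(
        "Insert Into nci_h23 Values (?, ?, ?, ?)",
        [
            (101, 1, -4.0, -1.5),  # placeholder row, outcome other than 2
            (102, 2, -6.2, -3.7),  # placeholder row, outcome 2
            (103, 2, -5.8, -3.1),  # placeholder row, outcome 2
        ],
    )
    return conn

if __name__ == "__main__":
    for row in find_active(demo_connection()):
        print(row)
```

Against the real database, the same Select statement would be sent through a PostgreSQL client library instead of sqlite3.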
6.4.2 Substances
The substances in PubChem are available as a set of sdf files. The data
in these files can be read by a wide variety of programs. 5 The one most
directly useful here produces a file of SQL commands to create a table
and copy data into it. This sdf2sql program* is available online. 6 Using the
PubChem file Substance_00000001_00025000.sdf.gz as input, sdf2sql
produces the following:
Create Table substance (
Title text,
BONDANNOTATIONS text,
CID_ASSOCIATIONS text,
COMPOUND_ID_TYPE integer,
EXT_DATASOURCE_NAME text,
EXT_DATASOURCE_REGID text,
EXT_DATASOURCE_URL text,
EXT_SUBSTANCE_URL text,
GENBANK_NUCLEOTIDE_ID text,
GENBANK_PROTEIN_ID text,
GENERIC_REGISTRY_NAME text,
PUBMED_ID text,
SUBSTANCE_COMMENT text,
SUBSTANCE_ID integer,
SUBSTANCE_SYNONYM text,
SUBSTANCE_VERSION integer,
TOTAL_CHARGE integer,
XREF_EXT_ID text);
Copy substance (
* Another approach is to store the properties from the sdf file in a separate table, rather than
as columns in the substance table. This is examined more fully in Chapter 11.
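The conversion described above, from sdf data fields to a Create Table statement followed by a Copy statement, can be sketched in a few lines of Python. This is not the sdf2sql program itself, only an illustration under stated assumptions: every record ends with a $$$$ line, each data item starts with a "> <NAME>" header and ends at a blank line, field names are already valid SQL identifiers, and a column is typed integer only when every value seen is a whole number. The function names parse_sdf and to_sql are hypothetical.

```python
import re

DATA_HEADER = re.compile(r">.*?<([^<>]+)>")  # e.g. "> <SUBSTANCE_ID>"
INTEGER = re.compile(r"-?\d+")

def _parse_record(lines):
    """Split one record into its title line and a dict of data fields."""
    title = lines[0].strip() if lines else ""
    fields, i = {}, 0
    while i < len(lines):
        match = DATA_HEADER.match(lines[i])
        i += 1
        if match:
            values = []
            # A data item runs until the next blank line.
            while i < len(lines) and lines[i].strip():
                values.append(lines[i].strip())
                i += 1
            fields[match.group(1)] = "\n".join(values)
    return title, fields

def parse_sdf(text):
    """Return a list of (title, fields) pairs, one per $$$$-terminated record."""
    records, current = [], []
    for line in text.splitlines():
        if line.startswith("$$$$"):
            records.append(_parse_record(current))
            current = []
        else:
            current.append(line)
    return records

def copy_escape(value):
    """Escape backslash, tab and newline for the text format of Copy."""
    return value.replace("\\", "\\\\").replace("\t", "\\t").replace("\n", "\\n")

def to_sql(records, table="substance"):
    """Build Create Table and Copy ... From stdin statements as one string."""
    names = sorted({name for _, fields in records for name in fields})

    def col_type(name):
        values = [fields[name] for _, fields in records if name in fields]
        return "integer" if all(INTEGER.fullmatch(v) for v in values) else "text"

    columns = ["Title text"] + [f"{name} {col_type(name)}" for name in names]
    out = [f"Create Table {table} (", ",\n".join(columns) + ");"]
    out.append(f"Copy {table} ({', '.join(['Title'] + names)}) From stdin;")
    for title, fields in records:
        row = [copy_escape(title)]
        # A field missing from a record is written as the Copy null marker \N.
        row += [copy_escape(fields[n]) if n in fields else r"\N" for n in names]
        out.append("\t".join(row))
    out.append("\\.")  # end-of-data marker for Copy ... From stdin
    return "\n".join(out)
```

Multi-line values, such as a list of synonyms, are kept in a single text column with embedded newlines, which matches storing the properties as columns in the substance table rather than in a separate table.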