Introduction - Design and Use of Relational Databases in Chemistry

and maintain robust and powerful chemical relational databases. And all four languages begin with S! Finally, you will see how you can use your familiarity with Perl, Python, or C to implement new functions in the database.

Much chemical data is stored in computer files, some of which have little or no structural organization. Other data files are more structured, perhaps in tabular form or as an Excel spreadsheet. There are many similarities between spreadsheet files and relational tables in a database. However, storing data in a relational database offers many advantages that are not possible when data is stored in files. The greatest advantage comes from the proper design and use of the tables themselves. Chapter 2 shows how to design and use tables to store and search numerical or text data. It explains the reason for using multiple tables and examines the use of relationships among tables. Finally, the entity-relationship diagram is presented as an aid to designing and understanding a database of tables.
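
As a rough illustration of why multiple related tables are useful, the following sketch stores compounds in one table and their measured properties in another, linked by a key. It uses Python's built-in sqlite3 module purely for convenience. The table names, column names, and helper functions are assumptions made for this example and are not taken from the chapters described here.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
    conn.executescript("""
        CREATE TABLE compound (
            compound_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            smiles      TEXT NOT NULL
        );
        CREATE TABLE measurement (
            measurement_id INTEGER PRIMARY KEY,
            compound_id    INTEGER NOT NULL REFERENCES compound(compound_id),
            property       TEXT NOT NULL,
            value          REAL NOT NULL,
            units          TEXT NOT NULL
        );
    """)
    conn.executemany(
        "INSERT INTO compound (compound_id, name, smiles) VALUES (?, ?, ?)",
        [(1, "ethanol", "CCO"), (2, "benzene", "c1ccccc1"), (3, "water", "O")],
    )
    # One compound can have many measurements: a one-to-many relationship.
    conn.executemany(
        "INSERT INTO measurement (compound_id, property, value, units) VALUES (?, ?, ?, ?)",
        [
            (1, "boiling point", 78.4, "degC"),
            (2, "boiling point", 80.1, "degC"),
            (2, "melting point", 5.5, "degC"),
            (3, "boiling point", 100.0, "degC"),
            (3, "melting point", 0.0, "degC"),
        ],
    )
    conn.commit()
    return conn


def compounds_boiling_above(conn, threshold_c):
    """Join the two tables to list compounds whose boiling point exceeds a threshold."""
    rows = conn.execute(
        """
        SELECT c.name, m.value
        FROM compound AS c
        JOIN measurement AS m ON m.compound_id = c.compound_id
        WHERE m.property = 'boiling point' AND m.value > ?
        ORDER BY m.value
        """,
        (threshold_c,),
    )
    return rows.fetchall()
```

Because each measurement row refers to a compound by its key, the compound's name and structure are stored once, and the database can reject a measurement that points to a compound that does not exist.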

An introduction to SQL is provided in Chapter 3, with an emphasis on examples relevant to chemical information rather than the business information often used in other introductions to SQL. Chapter 4 discusses some of the relational database management systems (RDBMS) that are available, namely Oracle, MySQL, and PostgreSQL. All of them use SQL to insert, delete, update, and select data. Chapter 5 shows ways in which client programs, including Web-based applications, are used to connect to the database server. Chapter 6 examines ways in which RDBMS are typically used to handle numerical and textual chemical information using relational tables. An example using data files from the PubChem project is included.
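
The four basic SQL operations mentioned above (insert, update, delete, and select) can be sketched as follows. This is a minimal example that uses Python's built-in sqlite3 module as a stand-in for a server such as Oracle, MySQL, or PostgreSQL. The table and the function name are assumptions made for illustration, and a real client would connect through that server's own driver.

```python
import sqlite3


def crud_demo():
    """Run one INSERT, UPDATE, DELETE, and SELECT against a hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE compound (name TEXT PRIMARY KEY, smiles TEXT)")

    # INSERT: add two rows, using placeholders rather than string formatting.
    conn.execute("INSERT INTO compound (name, smiles) VALUES (?, ?)", ("ethanol", "CCO"))
    conn.execute("INSERT INTO compound (name, smiles) VALUES (?, ?)", ("benzene", "C1=CC=CC=C1"))

    # UPDATE: replace the Kekule form of benzene with the aromatic form.
    conn.execute("UPDATE compound SET smiles = ? WHERE name = ?", ("c1ccccc1", "benzene"))

    # DELETE: remove one row.
    conn.execute("DELETE FROM compound WHERE name = ?", ("ethanol",))

    # SELECT: read back what remains.
    rows = conn.execute("SELECT name, smiles FROM compound ORDER BY name").fetchall()
    conn.close()
    return rows
```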

Chapter 7 introduces ways in which RDBMS can be used to handle chemical structural information using SMILES and SMARTS representations. It shows how extensions to relational databases allow chemical structural information to be stored and searched efficiently. In this way, chemical structures themselves can be stored in data columns. Once chemical structures become proper data types, many search and computational options become available. Conversion between different chemical structure formats is also discussed, along with input and output of chemical structures.

Chapter 8 shows ways in which molecular fragments can be used to speed up searches for chemical structures. Both path-based and fragment-based methods are discussed. Several types of molecular similarity are explained using bit-string fingerprints that represent the presence or absence of various fragments. Finally, it shows how tables of fragments, along with parameter values for these fragments, can be used to compute theoretical molecular properties.
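
To make the idea of bit-string fingerprints concrete, the sketch below represents a fingerprint as a Python integer in which each bit marks the presence or absence of one fragment. It shows one common similarity measure, the Tanimoto coefficient, and a simple fragment screen of the kind used to discard molecules before a full structure search. The function names and the convention of returning 0.0 for two empty fingerprints are assumptions made for this example. How the bits are actually generated from a structure is not shown.

```python
def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto similarity: bits set in both fingerprints / bits set in either."""
    common = bin(fp_a & fp_b).count("1")
    either = bin(fp_a | fp_b).count("1")
    if either == 0:
        return 0.0  # assumed convention when neither fingerprint has any bit set
    return common / either


def passes_screen(fp_molecule: int, fp_query: int) -> bool:
    """Return True if every fragment bit set in the query is also set in the molecule.

    A molecule that fails this test cannot contain the query as a substructure,
    so it can be skipped. A molecule that passes still needs a full match.
    """
    return (fp_molecule & fp_query) == fp_query
```

For example, `tanimoto(0b1011, 0b0011)` compares two fingerprints that share two set bits out of three set in either, giving a similarity of 2/3.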
