chapter 1
Introduction
The goal of this topic is to convince you that relational databases are the best
way to store, search, and even operate on chemical information. Whether
the database contains a hundred structures or ten million, a relational
database provides ways to ensure data integrity, to formalize relationships
among the data, and to extend the database when new data become available
or when new ways of operating on data become of interest.
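As a rough illustration of these three claims, the sketch below uses Python's built-in sqlite3 module. The table names (compound, melting_point), the column names, and the sample values are hypothetical, chosen only for this example; a production chemical database would more likely run on a full RDBMS server. Constraints guard integrity, a foreign key formalizes a relationship, and ALTER TABLE extends the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

# Hypothetical schema: constraints describe what counts as valid data.
conn.executescript("""
CREATE TABLE compound (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE,
    smiles  TEXT NOT NULL,
    mol_wt  REAL CHECK (mol_wt > 0)
);
CREATE TABLE melting_point (
    compound_id INTEGER NOT NULL REFERENCES compound(id),
    celsius     REAL NOT NULL
);
""")

# Illustrative sample rows.
conn.execute(
    "INSERT INTO compound (id, name, smiles, mol_wt) VALUES (1, 'ethanol', 'CCO', 46.07)"
)
conn.execute("INSERT INTO melting_point VALUES (1, -114.1)")


def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Data integrity: a negative molecular weight violates the CHECK constraint.
bad_weight = rejected(
    "INSERT INTO compound (name, smiles, mol_wt) VALUES ('bogus', 'C', -1.0)"
)
# Formal relationship: a measurement must refer to an existing compound.
orphan_row = rejected("INSERT INTO melting_point VALUES (999, 25.0)")

# Extension: a new kind of data becomes a new column; existing rows are kept.
conn.execute("ALTER TABLE compound ADD COLUMN cas_number TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(compound)")]

print(bad_weight, orphan_row, columns)
```

Both invalid inserts are refused by the database itself rather than by application code, which is the sense in which the schema, not each program that touches it, carries the rules.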
Some readers of this topic will have a background in chemistry and
wish to see how databases might assist their work. Some readers will
already have a background in programming and databases. After reading
this topic, you should have the ability to understand an existing relational
database schema or design a new schema containing tables of data and
chemical structures. You will learn how to take advantage of database
extension “cartridges” that provide ways of properly storing and searching
chemical structures, not just numerical or textual data. You will see
how you can download and install a fully functioning database with free
and open-source chemical extension cartridges. You will also see how the
database can be accessed on a computer network using existing applications
or ones that you wish to write.
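One way to see why structures need more than textual storage: the same molecule can usually be written as more than one valid SMILES string. The minimal sketch below, again using Python's sqlite3 module with a hypothetical structure table, stores ethanol as the text CCO and then searches for it under the equally valid spelling OCC. It does not use any chemical cartridge; it only shows what plain text comparison does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table that stores a structure as ordinary text.
conn.execute("CREATE TABLE structure (name TEXT, smiles TEXT)")
conn.execute("INSERT INTO structure VALUES ('ethanol', 'CCO')")


def names_matching(smiles):
    """Look up compounds by exact text comparison of the SMILES column."""
    rows = conn.execute("SELECT name FROM structure WHERE smiles = ?", (smiles,))
    return [name for (name,) in rows]


same_spelling = names_matching("CCO")   # the spelling that was stored
other_spelling = names_matching("OCC")  # same molecule, written from the oxygen end

print(same_spelling, other_spelling)
```

The second search finds nothing, because the database compares characters rather than atoms and bonds. Closing that gap is the job of a structure-aware data type and its search functions.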
There are many topics that describe relational database management
systems (RDBMS) and the structured query language (SQL) used
to manipulate the data. Understanding SQL is important, and this topic
contains an introduction to SQL. However, the focus is on the concepts
of relational data. One goal is to show how properly integrating a new
molecular structure data type yields a powerful, extended relational database
for use in chemistry. For those of you new to relational databases, the
SQL introduction should suffice for understanding the concepts in this
topic. For those of you already familiar with SQL, it is hoped that you
will see how the extensions described here provide a powerful, integrated
way to handle molecular structures within the database. In either case,
this topic contains plenty of practical SQL examples.
Much of this topic is a discussion of computer languages. SQL is a
type of programming language. Becoming fluent in SQL helps make the
most of a relational database. SMILES, SMARTS, and SMIRKS are chemical
computer languages that express many fundamental aspects of chemical
structure. Becoming fluent in all these languages will help you create