When searching a database, if an isomeric query is used, only struc-
tures with the identical stereochemistry will be found using either a
direct lookup or the matches function. If a nonchiral query is used, the
direct lookup will find matching nonchiral structures, including canonical
SMILES. When a nonchiral query is used in the matches function,
structures of any chirality will be found. There is no single best method for
dealing with a database containing many chiral molecules. It is important
to consider carefully how to design and search such a database.
7.5.6 Isotopes
It is possible to specify the isotope of any atom in a SMILES string. This
is generally not necessary because the most common isotope is simply
assumed. But if, for example, a database contains information about 13C,
this can be readily encoded into the SMILES using [13C] instead of simply
C. The [13C] atom is considered different from the normal C atom in a
SMILES. A direct lookup using canonical SMILES will not locate isotopes
of the same structure. A substructure search using the matches function
will locate isotopes. This is because the matches function uses SMARTS to
specify the desired substructure.
Isotopes can be used in SMARTS. If no isotope number is specified
in SMARTS, any isotope of the atom will match. For example,
select matches('N[13C]', 'C') will return true. However,
select matches('SNC', '[13C]') will return false. When a specific isotope is
mentioned in SMARTS, then only that isotope number will match.
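The isotope rule can be illustrated at the level of single atoms. The sketch below is not the matches function itself and does no real SMILES or SMARTS parsing. It assumes simple atom tokens such as C or [13C], and the names parse_atom, atom_matches, and any_atom_matches are hypothetical helpers introduced only for this illustration.

```python
import re

# Hypothetical pattern for a one-atom token such as "C", "Cl" or "[13C]".
# Hydrogen counts, charges and other bracket contents are not handled.
_ATOM = re.compile(r"^\[?(\d+)?([A-Z][a-z]?)\]?$")


def parse_atom(token):
    """Return (isotope or None, element symbol) for a one-atom token."""
    m = _ATOM.match(token)
    if m is None:
        raise ValueError(f"not a simple atom token: {token!r}")
    isotope = int(m.group(1)) if m.group(1) else None
    return isotope, m.group(2)


def atom_matches(target_token, query_token):
    """Apply the isotope rule to one target atom and one query atom."""
    t_iso, t_elem = parse_atom(target_token)
    q_iso, q_elem = parse_atom(query_token)
    if t_elem != q_elem:
        return False
    # No isotope in the query: any isotope of the element matches.
    if q_iso is None:
        return True
    # Isotope given in the query: only that isotope number matches.
    return t_iso == q_iso


def any_atom_matches(target_atoms, query_token):
    """True if at least one atom of the target satisfies the query atom."""
    return any(atom_matches(atom, query_token) for atom in target_atoms)
```

With the atoms of N[13C] written out as ["N", "[13C]"], the query C matches. With the atoms of SNC written out as ["S", "N", "C"], the query [13C] does not, which mirrors the two select statements above.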
7.5.7 Salts and Mixtures
Mixtures of structures, including salts, may be encoded
using SMILES. A period between two SMILES means that the compound
SMILES represents two or more noncovalently bonded structures associated
with each other, such as in a salt. For example, sodium benzoate
can be represented as c1ccccc1C(=O)O.[Na], or possibly as
c1ccccc1C(=O)[O-].[Na+]. It may be necessary to define a set of rules about
whether to represent salts using charged atoms or neutral atoms. Even with such a rule
in place, one component of this mixture may be considered the important
compound and the other component the counter-ion or secondary com-
ponent. In some cases, the counter-ion is obviously the smaller of the two
components. This is not always true. Another approach is to define a set of
typical counter-ions. This set may include large groups, such as acetate or
even bigger ions. Creating a table of typical counter-ions can help identify
the primary and secondary components in mixtures.
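The counter-ion table idea can be sketched in a few lines. This is a minimal illustration, not a production method. The set TYPICAL_COUNTER_IONS and the functions split_components and classify_components are hypothetical names, and the entries in the set are examples only. The comparison is a plain string match, so a real database would need to store and compare canonical SMILES for the lookup to be reliable.

```python
# Hypothetical table of typical counter-ions, keyed by SMILES as written.
# Plain string comparison is used here; canonical SMILES would be needed
# in practice so that equivalent spellings of one ion compare equal.
TYPICAL_COUNTER_IONS = {"[Na+]", "[Na]", "[K+]", "[Cl-]", "CC(=O)[O-]"}


def split_components(smiles):
    """Split a compound SMILES into its period-separated components."""
    return [part for part in smiles.split(".") if part]


def classify_components(smiles, counter_ions=TYPICAL_COUNTER_IONS):
    """Return (primary, secondary) component lists for a compound SMILES."""
    primary, secondary = [], []
    for part in split_components(smiles):
        if part in counter_ions:
            secondary.append(part)
        else:
            primary.append(part)
    return primary, secondary
```

For example, classify_components("c1ccccc1C(=O)[O-].[Na+]") places the benzoate in the primary list and [Na+] in the secondary list. Because the decision comes from the table rather than from component size, a large counter-ion is still classified as secondary as long as it is listed.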