Biomedical Engineering Reference
In-Depth Information
the tool level, allowing users to select which data sets they are comfortable
using, and understanding the caveats in doing so.
18.4 Chem2Bio2RDF architecture
Chem2Bio2RDF is available as a triple store with a SPARQL endpoint,
can be accessed indirectly through a variety of tools, and all of the
data can be downloaded in RDF format from the Chem2Bio2RDF
website [15]. Our Chem2Bio2OWL ontology is also freely available for
download.
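
As a minimal sketch of what querying a SPARQL endpoint over HTTP can look like, the snippet below builds a SPARQL Protocol GET request URL. The endpoint address is a placeholder, not the actual Chem2Bio2RDF endpoint, and the query is deliberately generic because the data set's real predicates are not listed here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint address (assumption); replace it with the one
# published on the Chem2Bio2RDF website.
ENDPOINT = "http://example.org/sparql"

# Generic query that returns any ten triples; it assumes no particular vocabulary.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_url(endpoint, query):
    """Return a GET URL that carries the query in the 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


url = build_sparql_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL could then be fetched with any HTTP client, typically with an Accept header such as application/sparql-results+json to request JSON results.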
As previously described, our data sets are organized into six categories
based on the kinds of biological and chemical concepts they contain.
Some data sources are listed in multiple categories. Some of the data were
already held in relational database format from our prior work; in these
cases they were simply converted into RDF/XML via the D2R server. For the
remaining data sets, we downloaded the raw data from the source web sites
and loaded them into our relational database using customized scripts. These
are then published as RDF in the Virtuoso Triple Store, where the data can
be queried via a SPARQL endpoint.
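
The relational-to-RDF step can be pictured with a small sketch. This is not the D2R server, only a hand-rolled illustration of the general idea that each row becomes a subject and each column a predicate. The namespace, table name, column names and sample row are all assumptions made for the example.

```python
import sqlite3

# Hypothetical namespace (assumption); the real Chem2Bio2RDF URIs may differ.
BASE = "http://example.org/chem2bio2rdf/"


def rows_to_ntriples(conn, table, key_column):
    """Turn every row of a table into N-Triples lines with literal objects.

    The table name is interpolated into the SQL, so it must come from
    trusted code, never from user input.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{record[key_column]}>"
        for col, value in record.items():
            if col == key_column or value is None:
                continue  # the key is already in the subject URI; skip NULLs
            predicate = f"<{BASE}{table}#{col}>"
            # Escape backslashes and quotes so the literal stays well-formed.
            literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} {predicate} "{literal}" .')
    return triples


# Illustrative in-memory table with one sample row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE interaction (id INTEGER PRIMARY KEY, cid INTEGER, uniprot TEXT)"
)
conn.execute("INSERT INTO interaction VALUES (1, 2244, 'P23219')")

triples = rows_to_ntriples(conn, "interaction", "id")
for line in triples:
    print(line)
```

A mapping tool such as D2R does the same kind of translation declaratively and would normally emit URIs, not plain literals, for identifiers such as compound and protein IDs.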
A list of data sets included in Chem2Bio2RDF is shown in Table 18.1,
along with the number of RDF triples for each set. We have developed a
streamlined process for the addition of new data sets. We adopted
PubChem Compound ID (CID) as the identifier for compounds, and
UniProt ID for protein targets. Compounds represented in other data
formats (e.g. SMILES, InChI and SDF) were mapped onto the compound
ID via InChI keys. All the triples are stored together and the whole set is
called the Chem2Bio2RDF data set. Initially, we developed a schema to
classify the concepts and the RDF resources in Chem2Bio2RDF. The RDF
data can be explored and queried on our web site (www.chem2bio2rdf.org).
Chem2Bio2RDF and its related tools rely heavily on open source
software: a list of open source software used in Chem2Bio2RDF along
with links for where the software can be downloaded is given in Table 18.2.
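
The mapping of compounds onto CIDs via InChI keys can be sketched as a simple lookup join. The sketch assumes the InChI keys have already been computed for each source record (for example with a cheminformatics toolkit, not shown) and that a key-to-CID table is available. The function name, record layout and the unmatched key are hypothetical.

```python
def map_to_cid(records, inchikey_to_cid):
    """Attach a PubChem CID to each record whose InChI key is in the lookup.

    Returns (mapped, unmapped) so that unmatched compounds can be reviewed
    instead of being dropped silently.
    """
    mapped, unmapped = [], []
    for rec in records:
        key = rec["inchikey"].strip().upper()  # normalize before lookup
        cid = inchikey_to_cid.get(key)
        if cid is None:
            unmapped.append(rec)
        else:
            mapped.append({**rec, "cid": cid})
    return mapped, unmapped


# Small illustrative lookup table (aspirin and caffeine).
inchikey_to_cid = {
    "BSYNRYMUTXBXSQ-UHFFFAOYSA-N": 2244,
    "RYYVLZVUVIJVGH-UHFFFAOYSA-N": 2519,
}

# Sample records as they might arrive from another source; the last key is
# a made-up placeholder used to show the unmatched case.
records = [
    {"name": "aspirin", "inchikey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"},
    {"name": "caffeine", "inchikey": " ryyvlzvuvijvgh-uhfffaoysa-n "},
    {"name": "unknown", "inchikey": "AAAAAAAAAAAAAA-UHFFFAOYSA-N"},
]

mapped, unmapped = map_to_cid(records, inchikey_to_cid)
print([(r["name"], r["cid"]) for r in mapped])
print([r["name"] for r in unmapped])
```

Keeping the unmatched records separate makes it easy to see how much of a newly added data set fails to map onto the common compound identifier.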
18.5 Tools and methodologies that use Chem2Bio2RDF
We have developed a variety of tools and algorithms that employ
the Chem2Bio2RDF architecture. How some of these relate to
 