Biomedical Engineering Reference
In-Depth Information
Table 18.1  Data sets included in Chem2Bio2RDF, ordered by number of RDF triples

Data set                                      Triples
ChEMBL                                     57 795 793
PubChem Bioassay                            5 908 479
Comparative Toxicogenomics Data set (CTD)   4 933 484
Miscellaneous Chemogenomics Sets            4 526 267
ChEBI                                       2 906 076
Database of Interacting Proteins (DIP)      1 113 871
BindingDB                                   1 027 034
HUGO HGNC (genes)                             860 350
KiDB (CWRU)                                   745 026
UniProt                                       596 274
PharmGKB                                      512 361
Human Protein Reference Database (HPRD)       477 697
KEGG (Pathways)                               477 697
MATADOR (Chemogenomics)                       269 656
BindingMOAD                                   255 257
DrugBank                                      189 957
Sider Side Effects Database                   127 755
Toxicogenomics Tracking Database (TTD)        116 767
Miscellaneous QSAR sets                        32 206
Drug Combination Database (DCDB)               20 891
OMIM                                           17 251
Reactome (Pathways)                            15 849

Chem2Bio2RDF is shown in Figure 18.2. All of the tools are freely
available for public use, and where possible the code has been submitted
to open-source repositories.
Graph theory is well established for the analysis and mining of
networked data, and lends itself naturally to application to RDF
networks. We implemented an algorithm for computing semantic
associations previously applied to social networks [16] for finding
multiple shortest or otherwise meaningful paths between any two entities
in a network, to enable all of the network paths within a given path
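
The idea of enumerating paths between two entities up to a given length can be illustrated with a minimal Python sketch. This is not the semantic association algorithm of [16], only a plain depth-first enumeration of simple paths under a length bound. The triples, the entity names (drug:A, protein:P1, and so on), and the function names build_graph and paths_within are all hypothetical and chosen purely for illustration; edges are treated as undirected, which is also an assumption.

```python
from collections import defaultdict


def build_graph(triples):
    """Build an undirected adjacency map from (subject, predicate, object) triples."""
    graph = defaultdict(set)
    for subject, _predicate, obj in triples:
        graph[subject].add(obj)
        graph[obj].add(subject)
    return graph


def paths_within(graph, source, target, max_edges):
    """Return every simple path from source to target that uses at most
    max_edges edges, ordered shortest first (ties broken alphabetically)."""
    found = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            found.append(path)
            continue
        if len(path) - 1 >= max_edges:
            continue  # length bound reached; do not extend this path
        for neighbour in graph.get(node, ()):
            if neighbour not in path:  # keep paths simple (no repeated nodes)
                stack.append((neighbour, path + [neighbour]))
    return sorted(found, key=lambda p: (len(p), p))


# Hypothetical toy triples, not taken from Chem2Bio2RDF.
TRIPLES = [
    ("drug:A", "binds", "protein:P1"),
    ("protein:P1", "encoded_by", "gene:G1"),
    ("gene:G1", "associated_with", "disease:D"),
    ("drug:A", "treats", "disease:D"),
    ("drug:A", "binds", "protein:P2"),
    ("protein:P2", "participates_in", "pathway:W"),
    ("pathway:W", "implicated_in", "disease:D"),
]

if __name__ == "__main__":
    g = build_graph(TRIPLES)
    for p in paths_within(g, "drug:A", "disease:D", max_edges=3):
        print(" -> ".join(p))
```

Raising or lowering max_edges changes how many indirect connections are returned; on a graph with tens of millions of triples an exhaustive enumeration like this would likely need pruning or ranking, which this sketch does not attempt.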