Biomedical Engineering Reference
In-Depth Information
replicated whilst other data are simply connected through links and identifiers. This
approach was integrated into the Oralome project.
Once target resources and data were identified, the modelling iteration started. This
task consisted of designing a common information model to support oral cavity data
from distinct resources.
Before the actual data integration, a system skeleton needed to be deployed. As
mentioned earlier in this article, several frameworks are designed for rapid
prototyping of data portals for life science projects, such as LOVD or GMODWeb. For
this specific task, we chose the Molgenis framework for its agility in creating a
database and application, complete with a data exploration web workspace, REST and
SOAP web services, and an R interface out of the box. For the data integration
process, Molgenis provides easy and direct data input through the web interface,
through any of the available services, or through a provided database API. Custom
data wrappers that collect data from miscellaneous resources can therefore be
implemented easily. Oralome required the deployment of general-purpose wrappers
that combine external data in the newly deployed Molgenis instance. These wrappers
allow for systematic information extraction from resources such as UniProt, NCBI or
STRING, amongst others. These resources provide several ways to retrieve
information, such as REST interfaces or APIs for Java development.
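As an illustration of the general-purpose wrapper idea, the following is a minimal Python sketch, not the wrappers actually deployed in Oralome. It assumes a resource that answers a query with tab-separated text; the names ProteinRecord, TsvWrapper and fake_fetch, the column headers, and the canned rows are all hypothetical placeholders. The transport is injected as a function so that a real REST call could be substituted without changing the mapping logic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ProteinRecord:
    """Common record shape shared by all wrappers (hypothetical)."""
    accession: str
    name: str
    organism: str
    source: str


def parse_tsv(text: str) -> List[Dict[str, str]]:
    """Turn a tab-separated response with a header row into dictionaries."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split("\t")
    return [dict(zip(header, line.split("\t"))) for line in lines[1:]]


class TsvWrapper:
    """Fetch a tab-separated listing and map its columns to ProteinRecord."""

    def __init__(self, source: str, fetch: Callable[[str], str],
                 columns: Dict[str, str]):
        self.source = source
        self.fetch = fetch        # injected so the transport can be swapped
        self.columns = columns    # common field name -> source column name

    def records(self, query: str) -> List[ProteinRecord]:
        rows = parse_tsv(self.fetch(query))
        return [
            ProteinRecord(
                accession=row[self.columns["accession"]],
                name=row[self.columns["name"]],
                organism=row[self.columns["organism"]],
                source=self.source,
            )
            for row in rows
        ]


def fake_fetch(query: str) -> str:
    """Stand-in for an HTTP call; returns canned placeholder rows."""
    return (
        "Entry\tProtein names\tOrganism\n"
        "X00001\tExample protein A\tExample organism\n"
        "X00002\tExample protein B\tExample organism\n"
    )


# Assumed column mapping for one resource; other resources would supply
# their own mapping and fetch function.
wrapper = TsvWrapper(
    source="UniProt",
    fetch=fake_fetch,
    columns={"accession": "Entry", "name": "Protein names",
             "organism": "Organism"},
)
records = wrapper.records("example query")
```

Because every wrapper returns the same record type, records from different resources could be loaded into the target database through a single code path.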
Through this streamlined data integration workflow, curated oral cavity data is
collected and reorganized in a publicly available web framework.
3.3 Oralome Development
Oralome consists of a set of tools and a database that provide access to information
related to several entities, such as microorganisms, proteins, diseases and pathways,
integrating crucial data regarding the oral cavity.
The top-level entity is the microorganism, which has several associated proteins.
Each protein has other identifiers linked to it, such as OMIM (Online Mendelian
Inheritance in Man), KEGG (Kyoto Encyclopedia of Genes and Genomes), PDB (Protein
Data Bank) and GO (Gene Ontology) terms. The tool focuses on two groups of proteins:
(1) a subset of microbial proteins determined experimentally, and (2) microbial
proteins expected to be present in saliva. For the first group, besides the
information retrieved from UniProt, Oralome will integrate information related to the
environment where a protein was identified (health or disease, regulation, age group,
and the particular source where it resides, for instance, mucosa or tongue).
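The entity relationships described above can be sketched as plain Python data classes. This is only one possible reading of the model, not the actual Oralome schema: the class and field names (Microorganism, Protein, IdentificationContext, cross_references and so on) are assumptions chosen for illustration, and the identifiers in the usage lines are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class IdentificationContext:
    """Environment in which an experimentally determined protein was found."""
    condition: str                      # "health" or "disease"
    regulation: Optional[str] = None
    age_group: Optional[str] = None
    source: Optional[str] = None        # e.g. "mucosa" or "tongue"


@dataclass
class Protein:
    accession: str
    # True for group (1), experimentally determined;
    # False for group (2), expected to be present in saliva.
    experimentally_determined: bool
    # Database name (OMIM, KEGG, PDB, GO) -> linked identifiers.
    cross_references: Dict[str, List[str]] = field(default_factory=dict)
    contexts: List[IdentificationContext] = field(default_factory=list)

    def add_cross_reference(self, database: str, identifier: str) -> None:
        self.cross_references.setdefault(database, []).append(identifier)


@dataclass
class Microorganism:
    """Top-level entity owning its associated proteins."""
    name: str
    proteins: List[Protein] = field(default_factory=list)

    def experimental_proteins(self) -> List[Protein]:
        return [p for p in self.proteins if p.experimentally_determined]


# Usage with placeholder values only.
organism = Microorganism(name="Example organism")
observed = Protein(accession="X00001", experimentally_determined=True)
observed.add_cross_reference("GO", "GO:placeholder")
observed.contexts.append(
    IdentificationContext(condition="disease", source="tongue")
)
predicted = Protein(accession="X00002", experimentally_determined=False)
organism.proteins.extend([observed, predicted])
```

Keeping the identification context as a separate class reflects that it applies only to the first group of proteins; predicted proteins simply carry an empty list.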
For Oralome tool development, we chose the Molgenis framework to generate the tools
and features needed to start compiling our database and to view its data easily and
rapidly.
Molgenis is a framework written in Java that accepts two XML files as input: a
database descriptor file and a user interface descriptor file. In the first file,
users specify how the database will be structured, including its entities and
relations; the second file specifies the layout of the web interface. Molgenis
generates a Java model and a database API, which are used to deploy the related SQL
tables, web services and web interface into a web server (Fig. 2).
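To make the descriptor-driven idea concrete, here is a small Python sketch that reads a toy XML description of entities and derives SQL tables from it. It is not Molgenis and does not use the Molgenis descriptor schema: the element and attribute names (model, entity, field, references), the type mapping, and the use of an in-memory SQLite database are all assumptions made for illustration. It covers only the step from database descriptor to SQL tables, not the generated Java model, web services or interface.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Toy descriptor; element and attribute names are illustrative only.
DESCRIPTOR = """
<model name="oralome_sketch">
  <entity name="microorganism">
    <field name="name" type="string"/>
  </entity>
  <entity name="protein">
    <field name="accession" type="string"/>
    <field name="microorganism_id" type="int" references="microorganism"/>
  </entity>
</model>
"""

# Assumed mapping from descriptor types to SQL column types.
SQL_TYPES = {"string": "TEXT", "int": "INTEGER"}


def table_statements(descriptor_xml: str) -> list:
    """Derive one CREATE TABLE statement per entity in the descriptor."""
    root = ET.fromstring(descriptor_xml.strip())
    statements = []
    for entity in root.findall("entity"):
        columns = ["id INTEGER PRIMARY KEY"]
        for field_el in entity.findall("field"):
            column = f'{field_el.get("name")} {SQL_TYPES[field_el.get("type")]}'
            target = field_el.get("references")
            if target:
                # Relations between entities become foreign keys.
                column += f" REFERENCES {target}(id)"
            columns.append(column)
        statements.append(
            f'CREATE TABLE {entity.get("name")} ({", ".join(columns)})'
        )
    return statements


statements = table_statements(DESCRIPTOR)

# Deploy the generated tables into a throwaway in-memory database.
connection = sqlite3.connect(":memory:")
for statement in statements:
    connection.execute(statement)

tables = [row[0] for row in connection.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
```

The point of the sketch is the design choice rather than the details: the structure is declared once in a descriptor, and everything downstream is derived from it, so a change to the model means editing the descriptor and regenerating.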