Biomedical Engineering Reference
To our knowledge, when this work was carried out there was no database combining
the proteomes of the major and minor salivary glands. The first step was therefore
to compile this information from different sources. The proteome data for the major
salivary glands (parotid, submandibular/sublingual) were obtained from the Salivary
Proteome Knowledge Base and from the Yates Lab, The Scripps Research Institute. The
proteome of human minor salivary gland secretion was obtained from the Oppenheim
Laboratory, Henry M. Goldman School of Dental Medicine, Boston University. The
proteins identified in the different studies were compared, and repeated entries were
eliminated.
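The comparison and de-duplication step can be sketched as follows. This is a minimal illustration, not the procedure actually used: the record layout, the `merge_proteomes` helper and the accession strings are all placeholders, and the sketch assumes that entries from different sources can be matched on a shared accession.

```python
from collections import OrderedDict


def merge_proteomes(*sources):
    """Merge protein lists from several sources, dropping repeated entries.

    Each source is a (source_name, iterable_of_accessions) pair. Matching on
    a normalised accession string is an assumption made for illustration.
    """
    merged = OrderedDict()  # accession -> list of sources reporting it
    for source_name, accessions in sources:
        for acc in accessions:
            key = acc.strip().upper()
            if not key:
                continue  # skip blank entries
            reported_by = merged.setdefault(key, [])
            if source_name not in reported_by:
                reported_by.append(source_name)
    return merged


# Placeholder accessions, not real data.
major = ("major_glands", ["ACC0001", "ACC0002", "acc0002 "])
minor = ("minor_glands", ["ACC0002", "ACC0003"])

catalogue = merge_proteomes(major, minor)
for accession, reported_by in catalogue.items():
    print(accession, reported_by)
```

Keeping the list of reporting sources alongside each accession preserves provenance, which may help when entries later have to be curated by hand.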
Biological information is constantly being updated. Since the first publication of
saliva proteomes, many of the originally identified proteins, catalogued as separate
entries in biological databases, have been merged with others, and some have been
deleted due to misidentification. Therefore, all information concerning the identified
proteins was manually curated and updated. IPI (International Protein Index) entries
were updated with the “IPI History Search” tool (www.ebi.ac.uk/IPI). All other
updates were made using the UniProt database.
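The effect of merged and deleted entries on a protein list can be illustrated with a small sketch. It does not reproduce the behaviour of the IPI History Search tool or of UniProt; the `history` table, the helper names and the identifiers are hypothetical, and the sketch assumes only that each outdated accession points either to a replacement or to nothing (a deleted entry).

```python
def update_accession(acc, history):
    """Follow merge records until a current accession is reached.

    `history` maps an outdated accession to its replacement, or to None when
    the entry was deleted. Returns the current accession, or None if the
    entry no longer exists.
    """
    seen = set()
    while acc in history:
        if acc in seen:
            raise ValueError(f"cyclic history for {acc}")
        seen.add(acc)
        acc = history[acc]
        if acc is None:
            return None
    return acc


def update_catalogue(accessions, history):
    """Return (current accessions without repeats, deleted accessions)."""
    current, deleted = [], []
    for acc in accessions:
        new = update_accession(acc, history)
        if new is None:
            deleted.append(acc)
        elif new not in current:
            current.append(new)
    return current, deleted


# Hypothetical history: OLD1 was merged into OLD2, which was merged into
# CUR1; OLD3 was deleted.
history = {"OLD1": "OLD2", "OLD2": "CUR1", "OLD3": None}
current, deleted = update_catalogue(["OLD1", "CUR1", "OLD3", "CUR2"], history)
print(current)  # ['CUR1', 'CUR2']
print(deleted)  # ['OLD3']
```

Updating identifiers can itself create new repeats (here `OLD1` and `CUR1` resolve to the same entry), so the update pass removes them again.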
3.2 Oral Cavity Data Integration
The orthogonal nature and innate heterogeneity of life science resources
have always hampered the integration of distributed data. Furthermore, research
in this field has entered a cycle in which computational solutions lag one step
behind the technological requirements of biology. This has produced a
growing disparity in bioinformatics software, where a few well-known and
widely used resources, such as UniProt or NCBI, co-exist with hundreds of smaller
independent tools.
Although the oral cavity presents a narrower scope, it involves assorted life science
fields, from microorganisms to proteins and diseases. Establishing new connections
amongst these diverse entities creates a high degree of complexity, thus requiring the
development of new ad hoc data integration software solutions. Large
warehouses that might contain this domain-specific information also contain
many other resources. Consequently, researchers are overwhelmed by huge datasets,
which makes their data of interest very difficult to find. For instance, isolating
oral cavity information within UniProt is an extremely laborious task.
From a technological perspective, there are miscellaneous strategies for solving data
integration problems. However, they all rely on three elementary concepts: warehousing,
middleware and link integration. Warehouse approaches aim to support an
efficient decision-making process, which requires aggregating all desired data in a
large central dataset [26]. Middleware-based solutions rely on the development of
specific wrappers to mediate connections between users' requests and the original data
servers [27]. Finally, link-based integration attempts to connect heterogeneous data
types by creating graphs or networks based on pointers between distinct data units
[28]. These approaches can be distinguished by the way they treat aggregated data.
Warehouses replicate entire resources, creating a truly integrated environment, whereas
middleware or link-based solutions only provide streamlined access to data,
resulting in virtual integration.
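The difference between replicated and virtual integration can be made concrete with a toy sketch. Everything in it is an assumption made for illustration: the two "remote" resources are plain in-memory dictionaries, and the class names `Warehouse`, `Middleware` and `LinkIndex` are hypothetical rather than taken from any existing system.

```python
# Two hypothetical remote resources, modelled as in-memory dictionaries.
PROTEIN_SOURCE = {"P1": {"name": "protein one"}, "P2": {"name": "protein two"}}
DISEASE_SOURCE = {"D1": {"name": "disease one", "proteins": ["P1"]}}


class Warehouse:
    """Replicates every record into one central store at load time."""

    def __init__(self, *sources):
        self.store = {}
        for source in sources:
            for key, record in source.items():
                self.store[key] = dict(record)  # local copy of the data

    def get(self, key):
        return self.store.get(key)


class Middleware:
    """Holds no data; a wrapper per source forwards each request."""

    def __init__(self, **wrappers):
        self.wrappers = wrappers  # source name -> lookup function

    def get(self, source_name, key):
        return self.wrappers[source_name](key)


class LinkIndex:
    """Stores only pointers between data units, forming a graph."""

    def __init__(self):
        self.links = {}

    def add_link(self, a, b):
        self.links.setdefault(a, set()).add(b)
        self.links.setdefault(b, set()).add(a)

    def neighbours(self, key):
        return sorted(self.links.get(key, set()))


warehouse = Warehouse(PROTEIN_SOURCE, DISEASE_SOURCE)
middleware = Middleware(protein=PROTEIN_SOURCE.get, disease=DISEASE_SOURCE.get)
links = LinkIndex()
for disease_id, record in DISEASE_SOURCE.items():
    for protein_id in record["proteins"]:
        links.add_link(disease_id, protein_id)

# Change the original source after the warehouse was built.
PROTEIN_SOURCE["P1"]["name"] = "protein one (renamed)"

print(warehouse.get("P1")["name"])              # replicated copy, now stale
print(middleware.get("protein", "P1")["name"])  # live value from the source
print(links.neighbours("P1"))                   # pointers only: ['D1']
```

In this sketch the warehouse answers from its own copy and therefore has to be refreshed when a source changes, whereas the middleware and link index always defer to the original resources.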
Although many examples can be found for each integration strategy, the most
common solutions involve developing a hybrid architecture, where some data is