Endnote
Looking to the immediate future, the database technologies that will most likely have a significant
impact on bioinformatics are the ones that deal with systems integration, the process in which
disparate computer applications and systems can share data. Because the applications in a typical
biotech laboratory are often cobbled together from different vendors and custom, in-house
development, and may be running on multiple generations of hardware, system integration is still a
custom-programming task. As a result, integrating every database in an organization can take
months of effort, incur considerable expense, and yield only mixed results. Part of the challenge is that,
due to the relative youth of the bioinformatics arena, the market has yet to respond to the need for
commercial integration tools that address the specific needs of the community. Two areas in which
rapid innovation is required for database integration and overall improved interoperability of
bioinformatics tools are vocabulary standards and DBMSs.
Although organizations such as NCBI and the National Library of Medicine are actively involved in
developing tools for the molecular biologist working in the field of bioinformatics, a vocabulary of
bioinformatics has yet to be defined. As a result, most data warehouses and data dictionaries are
based on ad-hoc compilations of existing vocabularies with additions made on an as-needed basis.
Part of the challenge of creating a standard bioinformatics vocabulary is determining the appropriate
level of granularity needed to adequately describe everything from nucleotide sequences and protein
structure to species data. This challenge is intensified as the focus of bioinformatics research shifts
from nucleotide sequencing to proteomics, which necessarily includes phenotypic expression data
stored in clinical systems. As a result, an all-encompassing vocabulary must increasingly incorporate
data from medical records and public health systems as well.
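As a rough illustration of the ad-hoc data dictionary approach described above, the following Python sketch maps field names from two sources onto a shared set of terms and flags fields that have no mapping yet. The source names, field names, and shared terms are all invented for illustration; they are assumptions and do not come from any real vocabulary standard.

```python
# Hypothetical ad-hoc data dictionary: (source, local field) -> shared term.
# Every name here is an illustrative assumption, not part of a real standard.
DATA_DICTIONARY = {
    ("sequence_db", "nt_seq"): "nucleotide_sequence",
    ("sequence_db", "organism"): "species",
    ("clinical_db", "dna"): "nucleotide_sequence",
    ("clinical_db", "dx_code"): "diagnosis",
}


def normalize(source, record):
    """Rename a record's fields to shared terms; collect unmapped fields."""
    mapped, unmapped = {}, []
    for field, value in record.items():
        term = DATA_DICTIONARY.get((source, field))
        if term is None:
            unmapped.append(field)  # candidate for an as-needed addition
        else:
            mapped[term] = value
    return mapped, unmapped


if __name__ == "__main__":
    record = {"dna": "ATGC", "dx_code": "X01", "bp_systolic": 120}
    print(normalize("clinical_db", record))
```

The unmapped list reflects the as-needed growth of such dictionaries: each new clinical or laboratory field forces a decision about whether, and at what granularity, to extend the shared vocabulary.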
In the area of DBMSs, although the relational model currently dominates the market, the complexity
of clinical and laboratory data is driving many researchers to seriously consider other DBMS
technologies, such as object-oriented DBMSs. While there is a great deal of interest in object-oriented
approaches to supporting bioinformatics computing, the information technology community is still
expressing caution toward the technology. This is partly because many object-oriented database
systems are incomplete, in that they lack backup and recovery functions. In addition, data models
often conflict, the languages supported by vendors are proprietary, scalability is unproven, and the
systems require huge amounts of memory and computational resources. In the recent past, vendors
have partially addressed these and other limitations of object-oriented DBMSs, but performance and scalability
concerns remain.
Several vendors are building what they consider the next generation of bioinformatics database
systems, but it's uncertain which of these systems will establish a standard. Consequently, the most
promising technologies in the systems integration arena are aimed at the general computing market,
such as Web Services, Storage Area Networks, Storage Service Providers, or Application Service
Providers. Time will tell which of these models, if any, can be shown to be economically viable, as
opposed to merely technologically viable. In most cases, this translates to technologies that are
transparent to the research workflow, thereby augmenting current processes and contributing to the
effectiveness of R&D.
By far the most significant challenges surrounding the effective use of database technology in
bioinformatics relate to issues of security, privacy, and bioethics, and how these issues will eventually
affect legislation that will either support or hamper advances in the field. Consider the privacy and
security issues associated with having an individual's medical records and DNA analysis online and
instantly available to teachers, employers, the courts, police, the FBI, and, inevitably,
hackers. For now, the challenge is achieving the level of database integration that would make these
issues a reality. At best, integration is limited to what Internet and intranet technology can support,
through both fixed or hard-wired links and, more commonly, through dynamic links provided by
online search engines. As described in Chapter 4, "Search Engines," significant progress in molecular
biology database integration is being made in this arena.
 
 