Biomedical Engineering Reference
In addition to purely technological challenges, there are issues in the basic approach and scientific
methods available that must be addressed before bioinformatics can become a self-supporting
endeavor. For example, working with tissue samples from a single patient means that the sample
size is very low, which may adversely affect the correlation of genomic data with clinical findings.
There is also no standardized vocabulary for describing nucleotide structures and sequences, and no
universally accepted data model. In addition, clinical data are needed to create clinical profiles
that can be compared with genomic findings.
For example, to search a medical database for clinical findings associated with a particular
disease, a standard vocabulary must be available for encoding the clinical information for later
retrieval. The consistency and specificity of a controlled vocabulary are what make it
effective as a database search tool and as a domain-specific vehicle of communication. As an
illustration of the specificity of controlled vocabularies, consider that in the domain of clinical
medicine there are several popular controlled vocabularies in use: the Medical Subject
Headings (MeSH), the Unified Medical Language System (UMLS), the Read Classification System (RCS),
the Systematized Nomenclature of Human and Veterinary Medicine (SNOMED), the International
Classification of Diseases (ICD-10), Current Procedural Terminology (CPT), and the Diagnostic and
Statistical Manual of Mental Disorders (DSM-IV).
Each vocabulary system has its strengths and weaknesses. For example, SNOMED is optimized for
accessing and indexing information in medical databases, whereas the DSM is optimized for
description and classification of all known mental illnesses. In practice, a researcher attempting to
correlate a gene sequence with the DSM definition of schizophrenia may have difficulty finding
relevant gene sequences in a database if the naming convention and definition used for the search
are based on MeSH nomenclature.
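The mismatch between the vocabulary used to index records and the vocabulary used to search them can be made concrete with a minimal sketch. It assumes a small hand-built crosswalk table that translates a search label in one vocabulary into the labels used for indexing in another. The names RECORDS, CROSSWALK, and find_sequences, the sequence identifiers, and the labels themselves are hypothetical and are not taken from the real MeSH or DSM.

```python
# Hypothetical illustration: labels and sequence IDs are invented for this
# sketch and are not real MeSH, DSM, or database entries.

# Gene-sequence records indexed under labels from one vocabulary ("DSM" here).
RECORDS = [
    {"sequence_id": "SEQ-001", "vocabulary": "DSM", "label": "schizophrenia, paranoid type"},
    {"sequence_id": "SEQ-002", "vocabulary": "DSM", "label": "bipolar I disorder"},
]

# Hand-built crosswalk: (source vocabulary, source label) -> labels per target vocabulary.
CROSSWALK = {
    ("MeSH", "schizophrenia"): {"DSM": ["schizophrenia, paranoid type"]},
}


def find_sequences(vocabulary, label, use_crosswalk=False):
    """Return the IDs of records whose index label matches the query."""
    targets = {(vocabulary, label)}
    if use_crosswalk:
        mapped = CROSSWALK.get((vocabulary, label), {})
        for target_vocab, target_labels in mapped.items():
            targets.update((target_vocab, t) for t in target_labels)
    return [r["sequence_id"] for r in RECORDS
            if (r["vocabulary"], r["label"]) in targets]


# A MeSH-based query finds nothing in DSM-indexed records...
print(find_sequences("MeSH", "schizophrenia"))
# ...unless the query is first translated through the crosswalk.
print(find_sequences("MeSH", "schizophrenia", use_crosswalk=True))
```

In this sketch the first query returns an empty list and the second returns the one record reachable through the crosswalk. A real mapping between vocabularies would be far larger and would rarely be one-to-one, which is part of the difficulty described above.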
A related issue is the challenge of data mining and pattern matching, especially as they relate to
searching clinical reports and online resources such as PubMed for signs, symptoms, and diagnoses.
A specific gene expression may be associated with "M.I." or "myocardial infarction" in one resource
and "coronary artery disease" in another, depending on the vocabulary used and the criteria for
diagnosis.
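One common way to approach this kind of pattern matching is to normalize the surface forms found in free text to a shared concept key before comparing resources. The following is a minimal sketch of that idea, not a description of how PubMed or any particular system works. The SYNONYMS table, the concept keys, and the concepts_in function are hypothetical.

```python
import re

# Hypothetical synonym table: surface forms mapped to made-up concept keys.
SYNONYMS = {
    "m.i.": "CONCEPT_MYOCARDIAL_INFARCTION",
    "myocardial infarction": "CONCEPT_MYOCARDIAL_INFARCTION",
    "coronary artery disease": "CONCEPT_CORONARY_ARTERY_DISEASE",
}


def concepts_in(text):
    """Return the set of concept keys whose surface forms appear in text."""
    lowered = text.lower()
    found = set()
    for phrase, concept in SYNONYMS.items():
        # Lookarounds stand in for word boundaries so that an abbreviation
        # ending in a period, such as "m.i.", still matches.
        pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
        if re.search(pattern, lowered):
            found.add(concept)
    return found


report_a = "Expression elevated in patients with M.I. on admission."
report_b = "Expression elevated following myocardial infarction."
report_c = "Expression elevated in coronary artery disease."

print(concepts_in(report_a) == concepts_in(report_b))  # same concept key
print(concepts_in(report_a) == concepts_in(report_c))  # different concept keys
```

Synonym matching of this kind unifies "M.I." and "myocardial infarction", but it still treats "coronary artery disease" as a separate concept. Differences that stem from the criteria for diagnosis, rather than from spelling, are not resolved by string normalization alone.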
Among the hurdles to success in biotech are politics and the disparate points of view found in any
company or research institution: decision makers in marketing and sales, R&D, and programming
are likely to have markedly different perspectives on how to achieve corporate and
research goals. Bioinformatics is therefore necessarily grounded in molecular biology, clinical
medicine, a solid information technology infrastructure, and business. The noble challenge of linking
gene expression with human disease in order to provide personalized medicine can be overshadowed by
the local issues involved in mapping clinical information from one hospital or healthcare institution
to another. The discussion that follows illustrates the distance between where science and society
are today, where they need to be in the near future, and how computational bioinformatics has the
potential to bridge the gap.