The Central Dogma - Bioinformatics Computing


In addition to purely technological challenges, there are issues in the basic approach and the
available scientific methods that must be addressed before bioinformatics can become a
self-supporting endeavor. For example, working with tissue samples from a single patient means that
the sample size is very small, which may weaken any correlation of genomic data with clinical
findings. There is also no standardized vocabulary for describing nucleotide structures and
sequences, and no universally accepted data model. Clinical data are needed as well, to create
clinical profiles that can be compared with genomic findings.

For example, searching a medical database for clinical findings associated with a particular
disease requires a standard vocabulary for encoding the clinical information so that it can be
retrieved later. The consistency and specificity of a controlled vocabulary are what make it
effective both as a database search tool and as a domain-specific vehicle of communication. As an
illustration of that specificity, consider that clinical medicine alone has several popular
controlled vocabularies in use: Medical Subject Headings (MeSH), the Unified Medical Language
System (UMLS), the Read Classification System (RCS), the Systematized Nomenclature of Human and
Veterinary Medicine (SNOMED), the International Classification of Diseases (ICD-10), Current
Procedural Terminology (CPT), and the Diagnostic and Statistical Manual of Mental Disorders
(DSM-IV).

Each vocabulary system has its strengths and weaknesses. For example, SNOMED is optimized for
accessing and indexing information in medical databases, whereas the DSM is optimized for
describing and classifying all known mental illnesses. In practice, a researcher attempting to
document the correlation of a gene sequence with the DSM definition of schizophrenia may have
difficulty finding gene sequences in the database that correlate with schizophrenia if the naming
convention and definition used in the search are based on MeSH nomenclature.

A related issue is the challenge of data mining and pattern matching, especially as they relate to
searching clinical reports and online resources such as PubMed for signs, symptoms, and diagnoses.
A specific gene expression may be associated with "M.I." or "myocardial infarction" in one resource
and "coronary artery disease" in another, depending on the vocabulary used and the criteria for
diagnosis.
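
The matching problem described above can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the synonym table, the concept identifiers (CONCEPT_MI,
CONCEPT_CAD), and the gene labels (GENE_A and so on) are hypothetical placeholders, not entries
from MeSH, SNOMED, or any other real vocabulary. The sketch shows how surface forms such as "M.I."
and "myocardial infarction" can be folded into one concept before records from different resources
are compared.

```python
import re

# Hypothetical synonym table: normalized surface form -> concept identifier.
# The identifiers are invented for illustration and come from no real vocabulary.
SYNONYMS = {
    "mi": "CONCEPT_MI",
    "myocardial infarction": "CONCEPT_MI",
    "coronary artery disease": "CONCEPT_CAD",
}


def normalize(term):
    """Lowercase, drop periods, and collapse whitespace so 'M.I.' becomes 'mi'."""
    cleaned = term.lower().replace(".", "")
    return re.sub(r"\s+", " ", cleaned).strip()


def to_concept(term):
    """Return the concept identifier for a term, or None if the term is unmapped."""
    return SYNONYMS.get(normalize(term))


def group_by_concept(records):
    """Group (gene, diagnosis) pairs by concept; unmapped diagnoses fall under None."""
    grouped = {}
    for gene, diagnosis in records:
        grouped.setdefault(to_concept(diagnosis), []).append(gene)
    return grouped


# Hypothetical records, as they might appear in two differently coded resources.
records = [
    ("GENE_A", "M.I."),
    ("GENE_B", "Myocardial  Infarction"),
    ("GENE_C", "coronary artery disease"),
]

print(group_by_concept(records))
```

With this table, the first two records fall under the same concept, while the third stays
separate: a synonym table can reconcile spelling and abbreviation, but it cannot decide whether
two different diagnoses should be treated as equivalent. That judgment depends on the vocabulary
and the diagnostic criteria, as noted above.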

Among the hurdles to success in biotech are politics and the disparate points of view within any
company or research institution: decision makers in marketing and sales, R&D, and programming are
likely to have markedly different perspectives on how to achieve corporate and research goals. As
such, bioinformatics is necessarily grounded in molecular biology, clinical medicine, a solid
information technology infrastructure, and business. The noble challenge of linking gene expression
with human disease in order to provide personalized medicine can be overshadowed by the local
issues involved in mapping clinical information from one hospital or healthcare institution to
another. The discussion that follows illustrates the distance between where science and society
are today and where they need to be in the near future, and how computational bioinformatics has
the potential to bridge the gap.
