Data Category: Examples
Data Sources: Patient, Clinical Studies, Genomic Studies, Public Databases, Private Databases
Applications: Search Engines, Statistical Analysis, Visualization, Simulation, Communications, Database Management System, Electronic Medical Record, Genomic
Databases: Public, Private, Taxonomy, Clinical, Genetic, Local, External, Archives
Data Formats: FASTA, PHYLIP, MAML, NEXUS, PAUP, FASTA+GAP, and mmCIF, Proprietary Clinical Formats, Local Application Formats
Interfaces: Local Databases, Online Databases, Data Warehouse, Application
Integration Tools: Data Dictionary, Network, Standards
Furthermore, many of the dozens of databases involved in pharmacogenomic research and
development use proprietary formats. This is especially true of clinical systems, many of which are
specialty-specific. For example, standard image formats for radiology databases include Digital
Imaging and Communications in Medicine (DICOM) and the American College of Radiology/National
Electrical Manufacturers Association (ACR/NEMA) standards. These standards were developed
primarily to facilitate multi-vendor connectivity to promote the development of Picture Archiving and
Communications Systems (PACS), but they have no provision for linking images with genomic
systems, such as gene expression databases.
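Because the imaging standards themselves carry no field for genomic data, any link between an image study and a gene expression record has to be maintained outside the image files. The following is a minimal sketch of such a crosswalk, assuming a patient identifier shared by both systems. The class names, field names, and the `link_by_patient` function are hypothetical and are not part of DICOM or of any gene expression database.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageStudy:
    patient_id: str   # hypothetical identifier shared across systems
    study_uid: str    # hypothetical identifier of the image study
    modality: str     # for example "CT" or "MR"


@dataclass(frozen=True)
class ExpressionRecord:
    patient_id: str   # same hypothetical shared identifier
    gene: str
    level: float      # expression level in arbitrary units


def link_by_patient(studies, records):
    """Group image studies and expression records under each patient id."""
    linked = defaultdict(lambda: {"studies": [], "expression": []})
    for study in studies:
        linked[study.patient_id]["studies"].append(study)
    for record in records:
        linked[record.patient_id]["expression"].append(record)
    return dict(linked)
```

The sketch only works if both systems agree on the patient identifier, which is exactly the kind of agreement the existing standards do not provide.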
The typical research laboratory must develop and maintain numerous interfaces between applications
and databases to provide the logical connectivity for data communications through the network
infrastructure. The simple network illustrated in Figure 2-2 glosses over the inner complexity of the
dozens of standards used throughout a typical information system, a problem at least partially
addressed by data dictionaries and conversion utilities. In practice, few laboratories or medical
facilities provide the degree of connectivity suggested by this discussion. The vast majority of
hospitals in the U.S. use paper charts to record patient history and physical findings.
Perhaps 5 percent of hospitals have a functional EMR, and most of these are partial implementations
that provide only summary information. Furthermore, these systems typically require researchers
and clinicians to learn several arcane languages and procedures to access all data that may be
relevant to a given patient. For example, clinicians may have to log in to a pathology system to check
urinalysis results, a radiology system to read the report on a patient's latest image studies, and an
admission, discharge, transfer (ADT) system to verify the patient's insurance provider. Similarly,
although many clinical studies are multimedia-rich, most radiology and pathology images, EKG
tracings, pulmonary function test curves, and other graphical materials are maintained in separate
databases that aren't connected to the main hospital or clinic network.
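To illustrate how a data dictionary and a conversion utility can reduce this burden, the following minimal sketch maps the field names used by three source systems onto one canonical vocabulary. The system names, field names, and the `to_canonical` function are hypothetical, chosen only for illustration; a real data dictionary would also record data types, units, and permitted values.

```python
# Hypothetical data dictionary: canonical field name -> name used by each system.
DATA_DICTIONARY = {
    "patient_id": {
        "pathology": "PT_NO",
        "radiology": "PatientID",
        "adt": "mrn",
    },
    "date_of_birth": {
        "pathology": "DOB",
        "radiology": "PatientBirthDate",
        "adt": "birth_dt",
    },
}


def to_canonical(record, system):
    """Convert one source record to canonical field names.

    Fields that the dictionary does not describe are dropped, and an
    unknown system yields an empty record.
    """
    converted = {}
    for canonical_name, local_names in DATA_DICTIONARY.items():
        local_name = local_names.get(system)
        if local_name is not None and local_name in record:
            converted[canonical_name] = record[local_name]
    return converted
```

With this dictionary, a pathology record such as `{"PT_NO": "12345", "DOB": "1970-01-01"}` and an ADT record such as `{"mrn": "12345"}` both expose the patient identifier under the single name `patient_id`.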
One approach to minimizing or hiding the complexity of the data-management process is to create a
single, integrated user interface. Just as the Windows or Macintosh operating systems hide the
complexity of computer operations from users, a unified user interface to a network of disparate
applications can hide the complexity of the data sources and various applications used to manipulate
the data. This unified user interface may take the form of a Web portal or the workstation's operating
system. For example, the flavors of UNIX for the PC, Macintosh, and dedicated UNIX workstations
each provide various views of local and networked applications. The challenge with hiding complexity
this way is that the constant changes in how data are actually managed in the background require
parallel updating of the user interface that provides a front end to the system.
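The idea of a unified front end can be sketched as a thin facade over the separate systems. The example below is an assumption-laden illustration, not a description of any actual product: the three back-end classes are in-memory stand-ins for the pathology, radiology, and ADT systems, and every class and method name is hypothetical.

```python
class PathologySystem:
    """Stand-in for a pathology database with its own access procedure."""

    def __init__(self, results):
        self._results = results  # patient_id -> list of lab results

    def lab_results(self, patient_id):
        return self._results.get(patient_id, [])


class RadiologySystem:
    """Stand-in for a radiology reporting system."""

    def __init__(self, reports):
        self._reports = reports  # patient_id -> latest report text

    def latest_report(self, patient_id):
        return self._reports.get(patient_id)


class AdtSystem:
    """Stand-in for an admission, discharge, transfer (ADT) system."""

    def __init__(self, insurers):
        self._insurers = insurers  # patient_id -> insurance provider

    def insurance_provider(self, patient_id):
        return self._insurers.get(patient_id)


class PatientPortal:
    """Single entry point that hides the three back-end systems."""

    def __init__(self, pathology, radiology, adt):
        self._pathology = pathology
        self._radiology = radiology
        self._adt = adt

    def summary(self, patient_id):
        # One call replaces three separate logins and query procedures.
        return {
            "lab_results": self._pathology.lab_results(patient_id),
            "latest_radiology_report": self._radiology.latest_report(patient_id),
            "insurance_provider": self._adt.insurance_provider(patient_id),
        }
```

The maintenance cost described above is visible even in this small sketch: if any back-end system renames a method or changes what it returns, `PatientPortal.summary` must be updated in step.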
The data-management process is much more involved than simply sending data to a database and
retrieving it later. As discussed in the following sections, the databases used in bioinformatics
research present a variety of challenges, many of which pertain to all phases of the data life cycle,
including security, standards, interoperability, longevity of data, access and version control, the
use of encryption, and minimizing access time. The data life cycle and the relevant issues that arise
at each stage in the life of data are discussed in the rest of this chapter. Finally, issues that pertain to