Biomedical Engineering Reference
In-Depth Information
pharmacogenomic laboratory discussed at the beginning of this chapter. The six-stage process
usually involves these phases: planning; data consolidation; data transformation; selective archiving;
data distribution; and ongoing maintenance.
In the planning stage, arguably the most important phase of data warehouse development,
representatives from administration, R&D, and information technology departments decide exactly
what to include in the data warehouse. Ideally, the data warehouse content should reflect the
questions likely to be asked. For example, researchers might want to correlate microarray values
with specific clinical diagnoses, and administrators might want to compile summaries of average
sequence run costs. Because of practical cost, resource, and performance limitations, it's normally
impossible to store every data element from every application in a data warehouse. The planning
phase directly impacts the eventual cost and functionality of the data warehouse.
In the consolidation phase, the selected data from each application database are restructured. This
typically involves adding fields and relations to reflect how the data will be used in the data
warehouse. The goal in the consolidation phase is to provide an efficient framework that supports
queries likely to be asked, as determined in the planning stage.
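As a minimal sketch of the consolidation step, assume two hypothetical source tables, `microarray` (from a laboratory application) and `clinical` (from a clinical records application), held in one SQLite database. The table names, column names, and the `consolidate` function are illustrative assumptions, not part of any particular product. The warehouse table gains a `diagnosis` field and an index so that the microarray-versus-diagnosis question from the planning stage can be answered without a join at query time.

```python
import sqlite3


def consolidate(conn):
    """Restructure selected source data into one warehouse table.

    All table and column names are hypothetical, chosen for illustration.
    """
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS wh_expression (
            sample_id  TEXT PRIMARY KEY,
            patient_id TEXT NOT NULL,
            gene       TEXT NOT NULL,
            intensity  REAL NOT NULL,
            diagnosis  TEXT              -- field added for the warehouse
        );
        -- Index chosen because queries are expected to filter by diagnosis.
        CREATE INDEX IF NOT EXISTS idx_wh_diagnosis
            ON wh_expression (diagnosis);
    """)
    # Copy the selected fields and attach the diagnosis via the shared key.
    # Samples without a matching clinical record keep a NULL diagnosis.
    conn.execute("""
        INSERT OR REPLACE INTO wh_expression
        SELECT m.sample_id, m.patient_id, m.gene, m.intensity, c.diagnosis
        FROM microarray AS m
        LEFT JOIN clinical AS c ON c.patient_id = m.patient_id
    """)
    conn.commit()
```

Because the insert uses `INSERT OR REPLACE` keyed on `sample_id`, the function can be rerun after each extraction without duplicating rows.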
The data transformation stage of data warehouse development involves transforming the
consolidated data into a more useful form through summarization and packaging. In summarization,
the data are selected, aggregated, and grouped into views more convenient and useful to users.
Packaging involves using the summarized data as the basis of graphical presentations, animations,
and charts.
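Summarization can be illustrated with a database view. The sketch below assumes a hypothetical `sequence_runs` table with `run_date` (ISO `YYYY-MM-DD` text) and `cost` columns, and builds the kind of average run cost summary mentioned earlier as an administrator's question. The table, view, and function names are assumptions made for the example.

```python
import sqlite3


def create_cost_summary(conn):
    """Define a view that aggregates run costs by calendar month.

    Assumes a hypothetical table sequence_runs(run_id, run_date, cost).
    """
    conn.execute("""
        CREATE VIEW IF NOT EXISTS monthly_run_cost AS
        SELECT strftime('%Y-%m', run_date) AS month,
               COUNT(*)                    AS runs,
               AVG(cost)                   AS avg_cost
        FROM sequence_runs
        GROUP BY strftime('%Y-%m', run_date)
    """)
    conn.commit()


def monthly_costs(conn):
    """Return (month, runs, avg_cost) rows ordered by month."""
    return conn.execute(
        "SELECT month, runs, avg_cost FROM monthly_run_cost ORDER BY month"
    ).fetchall()
```

The rows returned by `monthly_costs` could then feed the packaging step, for example as the data series behind a chart.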
Selective archiving involves moving older or infrequently accessed data to tape, optical, or other
long-term storage media. Archiving saves money by freeing expensive, high-speed magnetic storage, and
it minimizes the performance penalty imposed by locally storing data that are no longer necessary for
outcomes analysis.
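A minimal sketch of selective archiving follows, again assuming a hypothetical `sequence_runs` table with an ISO-formatted `run_date` column. Here an archive table in the same database stands in for the slower storage medium; in practice the archived rows would be written to tape, optical media, or another long-term store. The function name and cutoff policy are assumptions.

```python
import sqlite3


def archive_old_runs(conn, cutoff_date):
    """Move runs dated before cutoff_date ('YYYY-MM-DD') to an archive table.

    Returns the number of rows removed from the live table.
    """
    with conn:  # commits on success, rolls back if any statement fails
        # Create an empty archive table with the same columns, if needed.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS sequence_runs_archive AS "
            "SELECT * FROM sequence_runs WHERE 0"
        )
        # Copy the old rows first, then delete them from the live table.
        conn.execute(
            "INSERT INTO sequence_runs_archive "
            "SELECT * FROM sequence_runs WHERE run_date < ?",
            (cutoff_date,),
        )
        cur = conn.execute(
            "DELETE FROM sequence_runs WHERE run_date < ?", (cutoff_date,)
        )
    return cur.rowcount
```

Running the copy and the delete in one transaction means a failure partway through leaves the live table unchanged rather than losing rows.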
The distribution phase makes data contained in the data warehouse available to users. Providing for
distribution encompasses front-end development so that users can easily and intuitively request and
receive data, whether in real time or in the form of routine reports. Push technologies, including
e-mail alerts, can be used to distribute data to specific users. The Web is also a major portal for
accessing the data.
Maintenance is the final, ongoing stage of data warehouse development. However, creating a data
warehouse involves much more than simply designing and implementing a database. Even if there is
a process in place for extracting, cleaning, transporting, and loading data from sequence machines,
bibliographic reference databases, and other molecular biology applications, and distribution tools are
both powerful and intuitive, the data warehouse may not be sustainable in the long term. For
example, the process of extracting, cleaning, and reloading data can be prohibitively expensive and
time-consuming. A sustainable data warehouse benefits users enough that the return not only
justifies the original development but also warrants continual evaluation and redesign to meet
changing demands.
Infrastructure
From a hardware perspective, implementing a database requires more than servers, large hard
drives, and perhaps a network with its associated cables and electronics. Power conditioners and
uninterruptible power supplies are needed to protect sensitive equipment and the data it contains
from power surges and sudden, unplanned power outages. Providing a secure environment for data
includes the usual use of usernames and passwords to protect accounts. However, for higher levels of
assurance against data theft or manipulation, secure ID cards, dongles, and biometrics (such as
voice, fingerprint, and retinal recognition) may be appropriate.
Secure ID cards are credit card-sized pseudorandom number generators that are synchronized with a
similar generator on the server. Users enter the 16-digit number displayed on the secure ID card for