Biomedical Engineering Reference
Technology Overview
The remainder of this chapter provides an overview of the key technologies that can be applied to
data mining, especially those capable of supporting the basic data-mining methods outlined earlier.
As a prelude to this discussion, it's important to note that an efficient and effective data-mining
system requires, above all, an experimental design that reflects the biology of the data being mined.
In this regard, technology is an empowering agent that provides leverage to facilitate a well-designed
data-mining initiative; technology isn't a solution in itself. Simply connecting a black box to a
database in the hope that it will turn up previously hidden relationships in the data is unlikely to
succeed.
Given this caveat, data mining requires a hardware and software infrastructure capable of supporting
high-throughput data processing and a network capable of supporting data communications from the
database to the visualization workstation. With a robust hardware and software infrastructure in
place, processes such as machine learning can be used to automatically manage and refine the
knowledge-discovery and data-mining processes. This work can be performed with minimal user
interaction once a knowledgeable researcher has established the basic design of the system.
The core technologies that actually perform the work of data mining, whether under computer control
or directed by users, provide a means of simplifying the complexity and reducing the effective size of
the databases. This focus isn't limited to genome sequences and protein structures, but extends to
the wealth of data hidden in the online literature. Advanced text-mining methods are used to identify
textual data and place them in the proper context.
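As a rough illustration of what placing textual data in context can mean, the sketch below shows a keyword-in-context lookup, one of the simplest text-mining techniques. It is not a method described in this chapter; the function name `keyword_in_context`, the `window` parameter, and the sample sentence are all assumptions made for illustration.

```python
import re

def keyword_in_context(text, terms, window=3):
    """Return (term, context) pairs for each occurrence of a term in text.

    Hypothetical helper: `window` is the number of tokens kept on each
    side of a matched term.
    """
    # Split the text into simple word tokens (letters, digits, hyphens).
    tokens = re.findall(r"[A-Za-z0-9-]+", text)
    wanted = {t.lower() for t in terms}
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() in wanted:
            start = max(0, i - window)
            context = " ".join(tokens[start:i + window + 1])
            hits.append((tok, context))
    return hits

# Made-up sample sentence standing in for a line of online literature.
abstract = "Mutations in BRCA1 are associated with increased breast cancer risk."
hits = keyword_in_context(abstract, ["BRCA1"], window=2)
print(hits)
```

A production text-mining system would go much further, for example by recognizing gene-name synonyms and sentence boundaries, but the same identify-then-contextualize pattern applies.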
Finally, as discussed later in this chapter, although data mining was once relegated to internal
research groups, the technology is readily available today through a variety of commercial and
academic shareware tools. These tools range from shrink-wrapped, general-purpose software tools to
bioinformatics-specific commercial and academic systems designed for highly specific data-mining
applications.
 
 