Biomedical Engineering Reference
In-Depth Information
popular offerings are listed in Table 7-4. Some of these applications, such as Oracle Darwin, are tied
to specific database products, whereas others, such as SAS, can be used with any major database
system. Similarly, some of these applications, such as MATLAB, support a wide variety of data-mining
capabilities. MATLAB is an example of a commercial application that can be extended through a
variety of commercial and public-domain add-ons. If performance isn't a primary concern, then a
researcher with knowledge of Perl or Python, SQL, and MATLAB can probably handle any data-mining
challenge.
A sampling of the many academic bioinformatics-specific data-mining tools available includes MEME,
Pratt, PIMA, and SPEXS. MEME (Multiple EM for Motif Elicitation) is a motif discovery tool. Pratt, a
stand-alone pattern discovery tool, is designed to uncover patterns conserved in sets of unaligned
protein sequences. The user can specify what kind of patterns should be searched for, and how many
sequences should match a pattern to be reported. The Web-based version of Pratt, PrattWWW,
includes a visualization tool written as a Java applet to display patterns discovered in different
sequences. PIMA (Pattern-Induced Multi-sequence Alignment program) can be used to perform a
multi-sequence alignment of a set of sequences. All pairwise comparisons between sequences in the
set are performed and the resulting scores clustered into one or more families. SPEXS (Sequence
Pattern EXhaustive Search) is a sequence pattern discovery tool.
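The idea of reporting only patterns that match a minimum number of sequences can be illustrated with a small Python sketch. This is not Pratt's algorithm or interface: it assumes the simplest possible pattern class (exact substrings of a fixed length k), and the function name conserved_kmers, its parameters, and the toy sequences are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def conserved_kmers(sequences, k, min_support):
    """Return {kmer: support} for every length-k substring that occurs
    in at least min_support of the given (unaligned) sequences.

    Simplifying assumption: patterns are exact k-mers, with no
    wildcards or flexible gaps as real pattern-discovery tools allow.
    """
    support = defaultdict(int)
    for seq in sequences:
        # Collect k-mers in a set so that each sequence contributes
        # at most once to the support count of a given k-mer.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for kmer in seen:
            support[kmer] += 1
    # Keep only the patterns matched by enough sequences.
    return {kmer: n for kmer, n in support.items() if n >= min_support}


# Hypothetical toy protein fragments, not real data.
toy_sequences = ["MKTAYIAKQR", "GGTAYIAPLV", "TAYIAMMKQR"]

# Patterns of length 5 that occur in all three sequences.
strict = conserved_kmers(toy_sequences, k=5, min_support=3)

# Patterns of length 3 that occur in at least two sequences.
relaxed = conserved_kmers(toy_sequences, k=3, min_support=2)
```

Lowering min_support or k widens the set of reported patterns, which mirrors the trade-off a user faces when choosing how many sequences a pattern must match before it is reported.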
Because most of these bioinformatics-specific data-mining tools are optimized for a single
data-mining application, they tend to be very efficient. The downside of using such specialized
tools is the need to learn several different packages if a data-mining project extends from
nucleotide sequences to protein structures.