Biomedical Engineering Reference
In-Depth Information
Tool Selection
An arbitrary decision to use median spot fluorescence intensity instead of a mean or mode
measurement, for example, can drastically alter gene expression analysis. Ideally, the selection of a
statistical method reflects the researcher's knowledge of the underlying biological principles as well
as the inherent limitations of the statistical methods used to analyze the data. Researchers typically
consider the statistical methods used when determining whether the data from a particular
experiment is valuable to them.
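The sensitivity of a summary measure to outliers can be illustrated with a small sketch. The pixel intensities below are hypothetical values chosen for illustration, not data from any real array; the last value mimics a bright artifact within the spot.

```python
from statistics import mean, median

# Hypothetical pixel intensities for a single microarray spot (assumed values).
# The final value mimics a bright artifact, such as a dust speck.
spot_pixels = [410, 395, 402, 388, 415, 399, 4050]

spot_mean = mean(spot_pixels)      # pulled upward by the single outlier
spot_median = median(spot_pixels)  # largely unaffected by the outlier

print(f"mean intensity:   {spot_mean:.1f}")
print(f"median intensity: {spot_median:.1f}")
```

With these assumed values the mean is more than twice the median, so the same spot could be reported as strongly or weakly expressed depending on which summary is chosen.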
With the proliferation of multifunction calculators, dedicated statistical analysis software packages
(see Table 6-4 ), and statistical analysis available through general-purpose database and spreadsheet
programs, it's all too easy to statistically analyze research data without considering the underlying
assumptions of the statistical tools used. For example, many of the descriptive statistics assume that
the population data follow a known and definable distribution characterized by a few parameters, even though the
true distribution may be unknown. Similarly, even though common simplified applications of Bayes' Theorem assume
independence of variables, they are often used to estimate probabilities of co-occurring events that may be linked in some
way. In addition, it's possible to spend months on an experimental design and end up with worthless
data because the sample size or composition of the experimental groups is insufficient to address the
question at hand. In the vernacular of statisticians, the experimental design has insufficient power to
reject the null hypothesis even when that hypothesis is false.
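How sample size drives power can be sketched with a normal approximation for a two-sided, two-group comparison of means. The function name `approx_power`, the standardized effect size, and the group sizes are assumptions made for illustration; a real design would normally use a dedicated power-analysis tool and the exact test planned for the study.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the difference in group means divided by the common
    standard deviation (an assumed, standardized quantity).
    """
    z = NormalDist()                                  # standard normal
    z_crit = z.inv_cdf(1 - alpha / 2)                 # two-sided critical value
    shift = effect_size * sqrt(n_per_group / 2)       # expected test statistic
    # Probability that the statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical design choices: a medium effect with small and larger groups.
for n in (10, 64):
    print(f"n per group = {n:3d}, approximate power = {approx_power(0.5, n):.2f}")
```

Under these assumptions, ten subjects per group leaves the study unlikely to detect a medium-sized effect, whereas roughly 64 per group brings the approximate power to about 80 percent.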
Table 6-4. Statistical Analysis Tools. This sample is representative of the
thousands of tools available on the market for statistical analysis.

Type of Tool                  Examples
Dedicated, General-Purpose    SAS, Minitab, Matlab, Decision Pro, MVSP, SimStat, NCSS, PASS,
                              SISA, Statistica, S-Plus, R, SPSS, Perl, SigmaStat, Statview,
                              Prism, Mathematica, ProStat
Ancillary, General-Purpose    Microsoft Excel
Bioinformatics-Specific       BLAST, VAST, BioConductor
Excel Add-Ons                 Analyse-it, XLStat, XLStatistics

Selecting the statistical methods and tools most appropriate for a problem requires an understanding
of the assumptions of the available statistical methods, the underlying biology, the data
requirements, the validity of the overall experimental design, and computational requirements. One
way of assessing the performance of a set of statistical tools is to determine its sensitivity and
specificity. Given a criterion for when to call a test abnormal, sensitivity is the percentage of actual
positives that are correctly counted as positive, whereas specificity is the percentage of actual negatives that
are correctly counted as negative. Expressed another way, sensitivity is the number of true positives divided by the sum of
true positives and false negatives, as illustrated in Figure 6-20 . Similarly, specificity is the number of
true negatives divided by the sum of false positives and true negatives.
Figure 6-20. Sensitivity and Specificity. Both are a function of the number
of true and false positives and negatives. Moving the cutoff value (vertical
bar) to the right (dotted line) results in almost no false positives at the
expense of fewer true positives.
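The definitions above, and the effect of moving the cutoff, can be sketched in a few lines. The scores, labels, and cutoff values are hypothetical, and the helper name `sensitivity_specificity` is an assumption introduced for illustration.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Return (sensitivity, specificity) when scores >= cutoff are called positive."""
    tp = fp = tn = fn = 0
    for score, is_positive in zip(scores, labels):
        called_positive = score >= cutoff
        if is_positive and called_positive:
            tp += 1          # true positive
        elif is_positive:
            fn += 1          # false negative
        elif called_positive:
            fp += 1          # false positive
        else:
            tn += 1          # true negative
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical test scores: five actual negatives followed by five actual
# positives, with overlapping score ranges.
scores = [2, 3, 4, 5, 6, 5, 6, 7, 8, 9]
labels = [False] * 5 + [True] * 5

for cutoff in (5, 7):
    sens, spec = sensitivity_specificity(scores, labels, cutoff)
    print(f"cutoff = {cutoff}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this toy data set, raising the cutoff from 5 to 7 eliminates the false positives (specificity rises from 0.60 to 1.00) but misses two actual positives (sensitivity falls from 1.00 to 0.60), the same trade-off the figure describes.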
 
 