Biomedical Engineering Reference
Statistical Concepts
Given the breadth of bioinformatics, the statistical concepts relevant to the field could easily fill a
bookcase, to say nothing of a single chapter. As listed in Table 6-1, typical applications of statistics in
bioinformatics range from clinical diagnosis and descriptive summaries to gene hunting and
nucleotide alignment. Many of these applications are far removed from the traditional definition of a
statistic, which is simply a value calculated from a sample. For example, clinicians evaluating the
efficacy of a specific therapy for a genetic disease typically focus on disease prevalence (the number
of cases of an illness or condition that exist at a particular time in a defined population). They also
assess clinical and genetic tests for specificity (the probability of a negative result, given that the
condition under consideration is absent), sensitivity (the probability of a positive result, given that
the condition is present), and predictive value (the probability that the condition is present, given
the result of the test). Diagnosing patients who may suffer from genetic disorders typically involves
quantifying uncertainty and using statistical methods to predict long-term outcomes.
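The test measures described above can be illustrated with a short Python sketch. It assumes the results of a diagnostic test have been tallied into the four cells of a two-by-two table (true positives, false positives, false negatives, true negatives). The function name and the counts are hypothetical and chosen only for illustration; they are not data from the text. Predictive value is shown separately for positive and negative results.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summarize a diagnostic test from a 2x2 table of counts.

    tp, fp, fn, tn are hypothetical counts of true positives,
    false positives, false negatives, and true negatives.
    """
    total = tp + fp + fn + tn
    return {
        # Fraction of the studied population that has the condition.
        "prevalence": (tp + fn) / total,
        # P(positive result | condition present).
        "sensitivity": tp / (tp + fn),
        # P(negative result | condition absent).
        "specificity": tn / (tn + fp),
        # P(condition present | positive result).
        "positive_predictive_value": tp / (tp + fp),
        # P(condition absent | negative result).
        "negative_predictive_value": tn / (tn + fn),
    }


# Made-up counts for 1,000 people, 100 of whom have the condition.
metrics = diagnostic_metrics(tp=90, fp=45, fn=10, tn=855)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts the test has high sensitivity (0.90) and specificity (0.95), yet only about two-thirds of positive results correspond to true cases, because the condition is present in just 10 percent of the people tested. Predictive value depends on prevalence as well as on the test itself.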
In most cases, statistics are gathered in order to estimate population characteristics, or parameters.
These parameters are typically unknown and unknowable. Because a statistic is only an estimate of a
parameter, it is likely to be in error, and much of statistical work is devoted to quantifying the
magnitude of this error.
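One common way to quantify the error in a statistic is the standard error of the sample mean, which is the sample standard deviation divided by the square root of the sample size. The Python sketch below is a minimal illustration under that assumption; the text does not prescribe a particular method. The function name and the five measurements are hypothetical.

```python
import math
import statistics


def mean_with_standard_error(sample):
    """Return the sample mean and its estimated standard error.

    The mean is the statistic; the standard error estimates how far
    that statistic is likely to lie from the unknown population mean.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    # stdev() uses the n - 1 denominator (sample standard deviation).
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return mean, standard_error


# Hypothetical measurements, for illustration only.
measurements = [4.8, 5.1, 5.0, 4.9, 5.2]
sample_mean, standard_error = mean_with_standard_error(measurements)
print(f"mean = {sample_mean:.3f}, standard error = {standard_error:.4f}")
```

Because the standard error shrinks with the square root of the sample size, halving the expected error requires roughly four times as many observations.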
Table 6-1. Applications of Statistics in Bioinformatics.
Clinical Diagnosis
Descriptive Summaries
Equipment Calibration
Experimental Data Analysis
Gene Expression Prediction
Gene Hunting
Gene Prediction
Genetic Linkage Analysis
Laboratory Automation
Nucleotide Alignment
Population Studies
Protein Function Prediction
Protein Structure Prediction
Quantifying Uncertainty
Quality Control
Sequence Similarity
At this point in the discussion of statistics, it's important to consider the basic concepts of
randomness and probability as they relate to bioinformatics. Biological systems are inherently
random, meaning that they involve variables that have undetermined value but definite probability.
The first fruit fly to escape from a container of 50,000 flies when the container lid is opened may be
male or female, for example. Even though the sex is a random event, the probability is 0.5 that the