and the beak of a bird. Unrelated species frequently converge upon similar
morphologic solutions to common environmental conditions or shared physio-
logical imperatives. Algorithms that cluster organisms based on similarity are
likely to group divergent organisms under the same species.
It is often assumed that computational classification, based on morpho-
logic feature similarities, will improve when we acquire whole-genome
sequence data for many different species. Imagine an experiment wherein
you take DNA samples from every organism you encounter: bacterial
colonies cultured from a river, unicellular non-bacterial organisms found in
a pond, small multicellular organisms found in soil, crawling creatures
dwelling under rocks, and so on. You own a powerful sequencing machine
that produces the full-length sequence for each sampled organism, and you
have a powerful computer that sorts and clusters every sequence. At the
end, the computer prints out a huge graph, wherein all the samples are
ordered, and groups with the greatest sequence similarities are clustered
together. You may think you've created a useful classification, but you
haven't really, because you don't know anything about the organisms that
are clustered together. You don't know whether each cluster represents
a species, or a class (a collection of related species), or whether a cluster
may be contaminated by organisms that share some of the same gene
sequences, but are phylogenetically unrelated (i.e., the sequence similarities
result from chance or from convergence, not from descent from a common
ancestor). The sequences do not tell you very much about the
biological properties of specific organisms, and you cannot infer which bio-
logical properties characterize the classes of clustered organisms. You have
no certain knowledge whether the members of any given cluster of organ-
isms can be characterized by any particular gene sequence (i.e., you do not
know the characterizing gene sequences for classes of organisms). You
do not know the genus or species names of the organisms included in the
clusters, because you began your experiment without a presumptive taxonomy.
Basically, you know only what you knew before you started: that
individual organisms have unique gene sequences that can be grouped by
sequence similarity. A strictly molecular approach to classification has its
limitations, but we shall see, in Chapter 4, that thoughtful biologists can
use molecular data to draw profound conclusions about the classification of
living organisms.
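
The thought experiment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method taken from the text: the sample names and toy sequences are invented, and the similarity measure (Jaccard overlap of 3-letter subsequences) and the 0.5 threshold are arbitrary choices standing in for whatever a real sequence-clustering program would use.

```python
from itertools import combinations


def kmers(seq, k=3):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 = identical, 0.0 = disjoint)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_by_similarity(samples, k=3, threshold=0.5):
    """Group samples whose k-mer sets have Jaccard similarity >= threshold.

    Single-linkage grouping: two samples land in the same cluster if a
    chain of sufficiently similar pairs connects them.
    """
    profiles = {name: kmers(seq, k) for name, seq in samples.items()}
    parent = {name: name for name in samples}

    def find(x):
        # Follow parent links to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(samples, 2):
        if jaccard(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in samples:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical samples: anonymous identifiers and made-up sequences.
samples = {
    "sample_1": "ACGTACGTACGT",
    "sample_2": "ACGTACGTACGA",
    "sample_3": "TTGGCCAATTGG",
    "sample_4": "TTGGCCAATTGC",
}
clusters = cluster_by_similarity(samples)
print(clusters)
```

The output is nothing more than lists of sample identifiers. Nothing in it says whether a cluster corresponds to a species or a class, whether its members are related by descent, or what biological properties they share, which is the limitation described above.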
Taxonomists are constantly engaged in an intellectual battle over the
principles of biological classification. They all know that the stakes are high.
When unrelated organisms are mixed together in the same class, and when
related organisms are separated into unrelated classes, the value of the classi-
fication is lost, perhaps forever. To understand why this is true, you need to
understand that a classification is a hypothesis-generating machine. Species
within a class tend to share genes, metabolic pathways, and structural anatomy.
Shared properties allow scientists to form general hypotheses that may apply