the same cost, which is seldom the case in real world problems. For instance,
when an expert in brain tumors receives a patient who suffers from a headache,
the expert does not recommend a scan as the first diagnostic test, although it is
the most effective and accurate one, because the expert has economic
considerations in mind. Instead, the expert asks simple questions and orders
other, cheaper tests in order to isolate the simplest cases, and
recommends the expensive test only for the complex ones.
Effective learning algorithms should also take cost into consideration
in the concept learning process. Most of the currently available algorithms
for classification are designed to minimize zero-one loss or error rate: the
number of incorrect predictions made or, equivalently, the probability of
making an incorrect prediction. This implicitly assumes that all errors are
equally costly. But in most KDD applications this is far from the case.
According to Provost [ Provost and Fawcett (1997) ], “it is hard to imagine a
domain in which a learning system may be indifferent to whether it makes a
false positive or a false negative error.” Rarely are mistakes evenly weighted
in their cost. In such cases, accuracy maximization should be replaced with
cost minimization. In real-world applications of concept learning, there are
many different types of costs involved [ Turney (1995) ]. The majority of the
learning literature ignores all types of costs. The literature provides even
less guidance in situations where class distributions are imprecise or can be
changed [ Provost and Fawcett (1997) ].
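The difference between minimizing error rate and minimizing cost can be illustrated with a small sketch. Everything in it is an assumption made for illustration and does not come from the text: the cost matrix `COST`, the labels, the two sets of predictions, and the helper names `error_rate`, `total_cost` and `min_cost_class` are all hypothetical. It assumes a two-class problem in which a false negative is ten times as costly as a false positive.

```python
# Illustrative sketch only: the numbers and names are assumptions, not from the text.

def error_rate(y_true, y_pred):
    """Zero-one loss: the fraction of incorrect predictions."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)

def total_cost(y_true, y_pred, cost):
    """Sum of cost[true][predicted] over all examples."""
    return sum(cost[t][p] for t, p in zip(y_true, y_pred))

def min_cost_class(probs, cost):
    """Return the prediction with the lowest expected cost,
    given estimated class probabilities probs[true_class]."""
    classes = range(len(probs))
    expected = [sum(probs[t] * cost[t][p] for t in classes) for p in classes]
    return min(classes, key=lambda p: expected[p])

# Hypothetical cost matrix indexed as cost[true][predicted]; class 1 is "positive".
COST = [[0, 1],    # true negative costs 0, false positive costs 1
        [10, 0]]   # false negative costs 10, true positive costs 0

y_true = [0, 0, 0, 0, 1, 1]
pred_a = [0, 0, 0, 0, 0, 1]   # one false negative
pred_b = [1, 1, 0, 0, 1, 1]   # two false positives

print(error_rate(y_true, pred_a), total_cost(y_true, pred_a, COST))
print(error_rate(y_true, pred_b), total_cost(y_true, pred_b, COST))
print(min_cost_class([0.8, 0.2], COST))
```

Under these assumed costs, `pred_a` makes fewer errors than `pred_b` but incurs a higher total cost, and an example that is only 20% likely to be positive is still assigned to the positive class when expected cost is minimized. With a symmetric zero-one cost matrix the same example would be assigned to the negative class.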
Countless research results have been published based on comparisons
of classifier accuracy over benchmark data sets. Comparing accuracies on
benchmark data sets says little, if anything, about classifier performance
on real-world tasks [ Provost and Fawcett (1998) ]. Many learning programs
create procedures whose goal is to minimize the number of errors made
when predicting the classification of unseen examples. Few papers have
investigated the cost of misclassification errors [ Provost (1994) ] and very
few papers have examined the many other types of costs.
12.2 Type of Costs
A detailed bibliography of the different types of costs can be found in
[ Turney (2000) ]. The term cost is interpreted in its broadest meaning.
Cost may be measured in many different units, such as monetary units
(dollars), temporal units (seconds), or abstract units of utility. A benefit
can be considered as negative cost. A taxonomy for costs is presented in
[ Turney (2000) ], and it consists of the following types: