Fig. 1.2 DM methods
• Artificial Neural Networks (ANNs): powerful mathematical models suitable for
almost all DM tasks, especially predictive ones [7]. There are different
formulations of ANNs, the most common being the multi-layer perceptron (MLP),
Radial Basis Function Networks (RBFNs) and Learning Vector Quantization (LVQ).
ANNs are built from neurons, atomic units that aggregate their inputs into an
output according to an activation function. They often outperform other models
because of their complex structure; however, that complexity and the difficulty
of finding a suitable network configuration make them less popular than other
methods, and they are considered the typical example of black box models.
Similar to regression models, they require numeric attributes and no MVs.
However, if they are appropriately configured, they are robust against outliers
and noise.
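
As a rough illustration of the neuron described above, the following minimal sketch aggregates numeric inputs as a weighted sum and passes the result through an activation function. The logistic (sigmoid) activation, the function names and the example weights are assumptions chosen for illustration; the text does not prescribe any of them.

```python
import math


def sigmoid(z):
    """Logistic activation: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    """Aggregate the inputs as a weighted sum plus bias, then apply the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical numeric attributes and hand-picked weights (not from the text).
out = neuron([0.5, -1.0, 2.0], [0.4, 0.3, -0.1], bias=0.1)
print(out)
```

Note that the inputs must be numeric and complete, which mirrors the requirement of numeric attributes and no MVs: a missing or categorical value cannot enter the weighted sum without prior preprocessing.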
• Bayesian Learning: uses probability theory as a framework for making rational
decisions under uncertainty, based on Bayes' theorem [6]. The most widely
applied Bayesian method is Naïve Bayes, which assumes that the effect of an
attribute value on a given class is independent of the values of the other
attributes. Initial definitions of these algorithms only work with categorical
attributes, because the probability computation can only be made in discrete
domains. Furthermore, the independence assumption among attributes makes these
methods very sensitive to the redundancy and usefulness of some of the
attributes and examples in the data, as well as to noisy examples and outliers.
They cannot deal with MVs. Besides Naïve Bayes, there are also more complex
models based on dependency structures, such as Bayesian networks.
• Instance-based Learning: the examples are stored verbatim, and a distance
function is used to determine which members of the database are closest to a new
example whose prediction is desired. Also called lazy learners [3], the difference
among them lies in the distance function used, the number of examples taken to
 