describes the observed behavior. Given a hidden Markov model and the
associated observations, the third problem, also known as the decoding
problem, involves determining the most likely sequence of hidden states
that led to those observations. The major drawback of using hidden
Markov models for anomaly detection is that they are computationally
expensive, because they rely on parametric estimation techniques based on
the Bayes algorithm to learn the normal profile of the host/network under
consideration.
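As a rough illustration of the decoding problem, the sketch below uses the Viterbi algorithm, a standard dynamic-programming approach to decoding. The text does not name a specific algorithm, so this choice is an assumption. The two hidden states ("normal", "attack"), the three observation symbols and every probability are hypothetical values chosen only to make the example runnable.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # Pick the predecessor r that maximises P(path to r) * P(r -> s).
            p, back = max((prev[r][0] * trans_p[r][s], r) for r in states)
            column[s] = (p * emit_p[s][obs], back)
        best.append(column)

    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path


# Hypothetical model: all names and numbers below are illustrative assumptions.
states = ["normal", "attack"]
start_p = {"normal": 0.6, "attack": 0.4}
trans_p = {
    "normal": {"normal": 0.7, "attack": 0.3},
    "attack": {"normal": 0.4, "attack": 0.6},
}
emit_p = {
    "normal": {"read": 0.5, "write": 0.4, "exec": 0.1},
    "attack": {"read": 0.1, "write": 0.3, "exec": 0.6},
}

path = viterbi(["read", "write", "exec"], states, start_p, trans_p, emit_p)
print(path)  # ['normal', 'normal', 'attack']
```

The table has one column per observation and one entry per state, so decoding costs on the order of (number of observations) x (number of states squared) operations. The sketch covers decoding only and does not touch the more expensive step of learning the model parameters.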
6.6. Ensemble of Classifiers
Ensembles of classifiers often perform better than any individual classifier. This
can be attributed to three key factors [30].
Statistical:
Machine learning algorithms attempt to construct a hypothesis
that best approximates the unknown function that describes the data,
based on the training examples provided. Insufficient training data can
lead an algorithm to generate several hypotheses of equal performance.
By taking all of the candidate hypotheses and combining them to form
an ensemble, their votes are averaged and the risk of selecting incorrect
hypotheses is reduced.
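The statistical argument can be sketched with a toy example. Below, three hypothetical threshold classifiers all fit a very small training set perfectly, yet they disagree on unseen points; a simple majority vote combines them. The data, the thresholds and the helper names are assumptions made for illustration.

```python
from collections import Counter

# Tiny (hypothetical) training set: (feature value, class label).
train = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]


def make_threshold(t):
    """Hypothesis that predicts class 1 when x exceeds the threshold t."""
    return lambda x: int(x > t)


# Three candidate hypotheses that the sparse training data cannot tell apart.
hypotheses = [make_threshold(t) for t in (3.0, 5.0, 7.0)]


def accuracy(h, data):
    return sum(h(x) == y for x, y in data) / len(data)


def majority_vote(hs, x):
    """Combine the hypotheses by taking the most common prediction."""
    votes = Counter(h(x) for h in hs)
    return votes.most_common(1)[0][0]


print([accuracy(h, train) for h in hypotheses])  # [1.0, 1.0, 1.0]
print(majority_vote(hypotheses, 4.0))  # 0 (two of three hypotheses vote 0)
print(majority_vote(hypotheses, 6.0))  # 1 (two of three hypotheses vote 1)
```

With an odd number of voters and two classes there are no ties, and the vote behaves like the middle threshold rather than either extreme one.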
Computational:
Many machine learning approaches are not guaranteed
to find the optimal hypothesis; rather, they perform some kind of local
search that may end in a local minimum (rather than the global minimum,
that is, the optimal hypothesis). For example, decision trees and ANNs
can often produce sub-optimal solutions. By starting the local search
in different locations, an ensemble can be formed by combining the
resulting hypotheses; the resulting ensemble can therefore provide a better
approximation of the true underlying function.
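The premise of the computational argument, that a local search ends up in different places depending on where it starts, can be illustrated with plain gradient descent on a one-dimensional non-convex function. The loss function below is a hypothetical stand-in for a learner's training error, not taken from any real model, and the sketch shows only the restarts, not how the resulting hypotheses would be combined.

```python
def loss(w):
    # Hypothetical non-convex "training error" with two local minima.
    return w**4 - 3 * w**2 + w


def grad(w):
    # Derivative of the loss above.
    return 4 * w**3 - 6 * w + 1


def local_search(w, lr=0.01, steps=1000):
    """Plain gradient descent: follows the slope downhill from w."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w


starts = [-2.0, 0.0, 0.5, 2.0]
found = [local_search(w0) for w0 in starts]
best = min(found, key=loss)

for w0, w in zip(starts, found):
    print(f"start {w0:+.1f} -> w = {w:+.3f}, loss = {loss(w):+.3f}")
```

Under these assumptions, the starts at -2.0 and 0.0 settle near w = -1.30 (the global minimum) while the starts at 0.5 and 2.0 settle near w = 1.13 (a poorer local minimum). A single run can therefore return a sub-optimal hypothesis, whereas a set of runs from different starting points covers both.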
Representational:
The statistical and computational factors allow
ensembles to locate better approximations of the true hypothesis that
describes the data; the representational factor, by contrast, allows ensembles to expand the
space of representable functions beyond that achievable by any individual
classifier. The true function may not be representable
by an individual machine learning algorithm, but a weighted
sum of the hypotheses within the ensemble may extend the space of
representable hypotheses enough to allow a more accurate representation.
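A small sketch of the representational argument, under stated assumptions: the individual classifier is taken to be a one-dimensional decision stump (a single threshold test), and the target concept is an interval, which no single stump can represent. A weighted sum of three stumps, here with equal weights, does represent it. The data, thresholds and weights are hypothetical.

```python
# Hypothetical data: the positive class lies inside an interval.
data = [(1.0, -1), (3.0, +1), (5.0, +1), (7.0, -1)]


def make_stump(threshold, polarity):
    """Stump predicting `polarity` when x > threshold, else -polarity."""
    return lambda x: polarity if x > threshold else -polarity


def accuracy(h, samples):
    return sum(h(x) == y for x, y in samples) / len(samples)


# Every stump over a grid of thresholds and both polarities.
candidates = [make_stump(t, p) for t in (0.0, 2.0, 4.0, 6.0, 8.0) for p in (+1, -1)]
best_single = max(accuracy(h, data) for h in candidates)

# Weighted sum of three stumps (equal weights chosen for simplicity).
members = [make_stump(2.0, +1), make_stump(6.0, -1), make_stump(8.0, +1)]
weights = [1.0, 1.0, 1.0]


def ensemble(x):
    score = sum(w * h(x) for w, h in zip(weights, members))
    return +1 if score > 0 else -1


print(best_single)               # 0.75
print(accuracy(ensemble, data))  # 1.0
```

Each stump is monotone in x, so it cannot label the middle points positive and both ends negative; the weighted sum is not monotone and so can.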