Instead, they are found by trial and error (Sutton and Barto 1998). A common example
assumes an environment that interacts with an agent which can take actions that allow
it to move between different states. The best actions are guided by the evaluation
of a reward function defined for each problem.
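To make this setup concrete, the sketch below implements the agent-environment loop in Python. The Environment (a toy five-state chain with a reward at one end) and the RandomAgent are illustrative placeholders invented for this example; they do not correspond to any particular library or to the works cited here.

```python
import random

class Environment:
    """Toy five-state chain: the agent moves left or right and is
    rewarded only when it reaches the rightmost state."""
    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (move left) or +1 (move right); the chain is bounded
        self.state = min(max(self.state + action, 0), self.n_states - 1)
        reward = 1.0 if self.state == self.n_states - 1 else 0.0
        done = reward > 0.0
        return self.state, reward, done

class RandomAgent:
    """Acts purely by trial and error, ignoring past experience."""
    def act(self, state):
        return random.choice([-1, +1])

env, agent = Environment(), RandomAgent()
state, done, total_reward = env.reset(), False, 0.0
while not done:
    action = agent.act(state)                # the agent selects an action
    state, reward, done = env.step(action)   # the environment returns the new state and a reward
    total_reward += reward
print("episode return:", total_reward)
```

A learning agent would differ from RandomAgent only in that it uses the observed rewards to prefer actions that lead to higher returns.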
Humanoid Robotics is one of the research areas where Reinforcement Learning
(RL) has shown great potential (Vijayakumar et al. 2003). Although there are
still many difficulties in solving RL problems when their complexity
increases (e.g. when the dimensionality is too high or the states are continuous), there
have been successful real-life examples demonstrating its applicability, such
as the cart-pole task, in which an inverted pendulum is automatically balanced (Doya 2000).
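As an illustration of how such trial-and-error learning can be realized, the following sketch implements tabular Q-learning, a standard RL algorithm, on the same kind of toy discrete chain; it is not the specific (continuous-time) method used by Doya (2000). Applying it to a task like the cart-pole would additionally require discretizing or otherwise approximating the continuous state space, which is exactly one of the difficulties mentioned above.

```python
import random

N_STATES, ACTIONS = 5, [0, 1]            # toy chain; action 0 = left, 1 = right
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1   # learning rate, discount factor, exploration rate
Q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]

def step(state, action):
    """Toy dynamics: move along the chain; reward 1 at the rightmost state."""
    nxt = min(max(state + (1 if action == 1 else -1), 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, reward > 0.0

for episode in range(200):
    state, done = 0, False
    while not done:
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)                                       # explore
        else:
            best = max(Q[state])
            action = random.choice([a for a in ACTIONS if Q[state][a] == best])   # exploit
        nxt, reward, done = step(state, action)
        # Q-learning update: move Q(s, a) towards reward + gamma * max_a' Q(s', a')
        Q[state][action] += ALPHA * (reward + GAMMA * max(Q[nxt]) - Q[state][action])
        state = nxt

print("greedy action per state:", [Q[s].index(max(Q[s])) for s in range(N_STATES)])
```

After a few hundred episodes the greedy policy consistently moves right towards the rewarded state, i.e. the best actions have been found purely through interaction with the environment.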
2.5.2 Machine Learning Approaches
Several ML modeling approaches have been developed throughout the years in order
to solve different tasks such as classification, regression and clustering (Bishop 2006).
Some of them are based on deterministic models, which aim to find fixed causal
relationships between events. Other approaches are probabilistic and
assume that observed events are generated from a probability distribution. Combinations
of both have also been explored (Franc et al. 2011).
In the following list, the most popular ML algorithms are briefly described. Then,
in the next section, we focus in particular on SVMs, as they are the central ML
algorithm employed in this thesis.
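Before describing the algorithms, the short sketch below shows how the classifiers in the following list are typically trained and evaluated in practice. It assumes the scikit-learn Python library and its bundled Iris dataset purely for illustration; neither is necessarily the toolkit or data used in this thesis.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

classifiers = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "k-NN (k=5)":    KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes":   GaussianNB(),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)        # learn a model from the training samples
    y_pred = clf.predict(X_test)     # predict the class of unseen samples
    print(f"{name}: test accuracy = {accuracy_score(y_test, y_pred):.3f}")
```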
Decision Tree (DT): is a predictive model that makes choices by applying a set of
hierarchical rules to the input data. Different versions have been proposed, such
as ID3 and C4.5 (Quinlan 1986, 1993). It is a common approach for classification,
particularly because the resulting models are easily interpretable by humans (due
to their intrinsic tree structure).
Random Forest (RF): is an ML meta-classifier built from an ensemble of DTs. The
predicted class is the one occurring most frequently among the outputs of the
individual DTs (Breiman 2001).
k-Nearest Neighbors (k-NN): this deterministic learning approach exploits similarity
measures between data for classification and regression tasks. Given a new
sample, the approach finds the k closest samples in a training set and decides the
prediction from their values (e.g. by majority vote in classification or by averaging
in regression) (Altman 1992). Its main disadvantage lies in the size of its model,
which depends on the whole training set and can make the approach unfeasible for
large datasets. There are, however, variants that apply data reduction techniques
to alleviate this issue.
Naive Bayes (NB): is a popular probabilistic classifier based on Bayes' theorem
that predicts the class of a given sample by assuming an underlying probability
model of the data and making strong independence assumptions among its features.
Even though its formulation is quite simple, it has been shown to perform well in practice.
 