Ensembles of Least Squares Classifiers with Randomized Kernels

Kari Torkkola¹ and Eugene Tuv²

¹ Motorola, Intelligent Systems Lab, Tempe, AZ, USA
  kari.torkkola@motorola.com
² Intel, Analysis and Control Technology, Chandler, AZ, USA
  eugene.tuv@intel.com
Summary. For the recent NIPS-2003 feature selection challenge we studied ensembles of regularized least squares classifiers (RLSC). We showed that stochastic ensembles of simple least squares kernel classifiers give the same level of accuracy as the best single RLSC, and the results achieved were ranked among the best at the challenge. We also showed that the performance of a single RLSC is much more sensitive to the choice of kernel width than that of an ensemble. As a continuation of this work, we demonstrate that stochastic ensembles of least squares classifiers with randomized kernel widths and OOB (out-of-bag) post-processing often outperform the best single RLSC, and require practically no parameter tuning. We used the same set of very high-dimensional classification problems presented at the NIPS challenge. Fast exploratory Random Forests were applied first for variable filtering.
1 Introduction
Regularized least-squares regression and classification date back to the work of Tikhonov and Arsenin [17], and have recently been re-advocated and revived by Poggio, Smale and others [6, 13-15]. The Regularized Least Squares Classifier (RLSC) is an old technique that combines a quadratic loss function with regularization in a reproducing kernel Hilbert space, leading to the solution of a simple linear system. In many of the cases reported in the work cited above, this simple RLSC appears to equal or exceed the performance of support vector machines and other modern developments in machine learning.
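As a point of reference, the following is a minimal sketch in Python/NumPy of a single RLSC with a Gaussian kernel; the regularization constant lam, the kernel width, and the exact scaling of the regularizer are free choices here, not the authors' settings:

    import numpy as np

    def gaussian_kernel(A, B, width):
        """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2.0 * width ** 2))

    def rlsc_fit(X, y, width, lam):
        """Fit an RLSC on labels y in {-1, +1}: solve (K + lam*n*I) c = y."""
        n = X.shape[0]
        K = gaussian_kernel(X, X, width)
        return np.linalg.solve(K + lam * n * np.eye(n), y)

    def rlsc_predict(X_train, c, X_test, width):
        """Decision values f(x) = sum_i c_i k(x_i, x); the sign gives the class."""
        return gaussian_kernel(X_test, X_train, width) @ c

An ensemble in the spirit of the Summary above would fit many such classifiers, each on a random subsample of the data and with a kernel width drawn at random, and then average their decision values.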
The combination of RLSC with Gaussian kernels and the usual choice of spherical covariances gives equal weight to every component of the feature vector. This poses a problem if a large proportion of the features consists of noise, which is exactly the case with the datasets of the challenge. In order to succeed in these circumstances, noise variables need to be removed or weighted down. We apply ensemble-based variable filtering to remove noise variables: a Random Forest (RF) is trained for the classification task, and an importance measure for each variable is derived from the forest [4]. Only the highest-ranking variables are then retained as inputs to the classifier.
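For concreteness, a minimal sketch of such a filtering step, using scikit-learn's RandomForestClassifier as a stand-in for the Random Forest used in the paper; the cut-off parameter n_keep is hypothetical and would be chosen per dataset:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def rf_variable_filter(X, y, n_keep, n_trees=100, random_state=0):
        """Rank variables by RF importance and keep the top n_keep.

        A sketch of ensemble-based variable filtering; the paper derives
        importances from its own Random Forest, not from scikit-learn.
        """
        rf = RandomForestClassifier(n_estimators=n_trees,
                                    random_state=random_state)
        rf.fit(X, y)
        # Indices of variables sorted by decreasing importance.
        top = np.argsort(rf.feature_importances_)[::-1][:n_keep]
        return X[:, top], top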