adaptive multivariate regression splines, neural networks, support vector machines,
and regularization networks. Interestingly, the latter techniques can be interpreted
in the framework of regularization networks [GJP95]. With these techniques it is
possible to treat quite high-dimensional problems, but the amount of data is limited
for complexity reasons. The situation is reversed in many practical applications,
such as the recommendation applications presented here, where the dimension of the
resulting problem is moderate but the amount of data is usually huge. There is
therefore a strong need for methods that can also be applied in this situation.
We will see that sparse grids can cope with the complexity of the problem,
at least to some extent. Moreover, sparse grids are perfectly suited for data
adaptivity and show many other advantages. They represent a very modern
approach to scoring.
7.2 The Sparse Grid Approach
Classification of data can be interpreted as a traditional scattered data approximation
problem with certain additional regularization terms. In contrast to conventional
scattered data approximation applications, we now encounter quite high-dimensional
spaces. To this end, the approach of regularization networks [GJP95]
provides a good framework. This approach allows a direct description of the most
popular neural networks, and it also allows for an equivalent description of support
vector machines and n-term approximation schemes [EPP00, Gir98].
We start with the scoring problem for a given data set S described at the
beginning of this chapter. This is clearly an ill-posed problem since there are
infinitely many solutions possible. To get a well-posed, uniquely solvable problem,
we have to assume further knowledge on f . To this end, regularization theory
[TA77, Wah90] imposes an additional smoothness constraint on the solution of
the approximation problem, and the regularization network approach considers the
variation problem
$$\min_{f \in V} R(f), \quad \text{with} \quad R(f) = \frac{1}{M} \sum_{i=1}^{M} C\big(f(x_i), y_i\big) + \lambda \Phi(f). \qquad (7.1)$$
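To make the structure of (7.1) concrete, the following sketch instantiates it for the simplest case: a linear model f(x) = w·x, the squared-error cost C(f(x_i), y_i) = (f(x_i) − y_i)², and the ridge-type smoothness functional Φ(f) = ||w||². This choice is only an illustration (the function names and data here are not from the text); with these choices the minimizer has a closed form via the normal equations.

```python
import numpy as np

def fit_regularized(X, y, lam):
    """Minimize R(f) = (1/M) * sum_i (w.x_i - y_i)^2 + lam * ||w||^2
    for a linear model f(x) = w.x (squared-error cost, ridge penalty).
    Closed form via normal equations: (X^T X / M + lam*I) w = X^T y / M."""
    M, d = X.shape
    A = X.T @ X / M + lam * np.eye(d)
    b = X.T @ y / M
    return np.linalg.solve(A, b)

# Illustrative data: a noisy linear target (purely synthetic).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=200)

w_small = fit_regularized(X, y, lam=1e-6)  # weak regularization: close to least squares
w_large = fit_regularized(X, y, lam=10.0)  # strong regularization: shrunken weights
```

Increasing λ trades data fidelity (the first term) for smoothness (the second term): the weights are pulled toward zero, which is exactly the balancing role of the regularization parameter described below.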
Here, C(·,·) denotes an error cost function which measures the interpolation error,
and Φ(f) is a smoothness functional which must be well defined for f ∈ V. The first
term enforces closeness of f to the data, the second term enforces smoothness of f,
and the regularization parameter λ balances between these two terms. Typical
examples are