4.2 Machine Learning: Prologue on Support Vector Regression
Support Vector Machines (SVM), originally used for classification purposes, can also be applied to regression problems by introducing an alternative loss function, as already stated in the previous section. Support Vector Regression (SVR) maps the input data x into a higher-dimensional feature space F by a nonlinear mapping, and a linear regression problem is then obtained and solved in this feature space. The goal is to find a function f(x) that has a maximum deviation of ε from the actually obtained output y for all the training data; that is, all errors less than ε are accepted, but not more than that. Given a set of N training data $\{(x_i, y_i) \mid x_i \in \mathbb{R}^n,\ y_i \in \mathbb{R},\ i = 1, 2, \ldots, N\}$, where x_i denotes the input vector of dimension n, y_i is the corresponding target value, and N is the total number of data patterns. The linear regression function is:

$$f(x) = \langle w, x \rangle + b, \qquad w \in \mathbb{R}^n,\ b \in \mathbb{R} \qquad (3)$$
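As a minimal illustration of Eq. (3), the sketch below evaluates the linear regression function for a single input vector using NumPy; the dimension n = 3 and the particular values of w, b, and x are assumptions chosen only for illustration, not taken from the text.

import numpy as np

# Illustrative (assumed) weight vector w in R^3 and bias term b
w = np.array([0.5, -1.2, 0.3])
b = 0.7

def f(x, w, b):
    # Linear regression function of Eq. (3): f(x) = <w, x> + b
    return np.dot(w, x) + b

x = np.array([1.0, 0.0, 2.0])   # an input vector x in R^3
print(f(x, w, b))               # 0.5*1.0 - 1.2*0.0 + 0.3*2.0 + 0.7 = 1.8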
Here, w is the weight vector and b is the bias term. To estimate the values of w and b for the selection of the best hyperplane, we need to minimize the following regularized risk function:
$$R = \frac{1}{2}\,\|w\|^2 + C \sum_{i=1}^{N} L\big(y_i, f(x_i)\big) \qquad (4)$$
where the first term is the regularization term, which represents the prediction ability of the regression, and the second term is the empirical error or risk; the constant C > 0 determines the trade-off between the training errors and the model complexity. The ε-insensitive loss function, present in the second term of the risk function, is defined as:
$$L\big(y, f(x)\big) = \begin{cases} 0, & |y - f(x)| \le \varepsilon \\ |y - f(x)| - \varepsilon, & |y - f(x)| > \varepsilon \end{cases}$$
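To make the ε-insensitive loss and the risk function of Eq. (4) concrete, the following sketch computes both for a small toy data set; the sample values, ε = 0.1, and C = 1.0 are assumptions chosen purely for illustration.

import numpy as np

def eps_insensitive_loss(y, y_pred, eps):
    # epsilon-insensitive loss: zero inside the eps-tube, linear outside it
    return np.maximum(0.0, np.abs(y - y_pred) - eps)

def regularized_risk(w, y, y_pred, C, eps):
    # Regularized risk of Eq. (4): 0.5*||w||^2 + C * sum of empirical losses
    return 0.5 * np.dot(w, w) + C * np.sum(eps_insensitive_loss(y, y_pred, eps))

w = np.array([0.5, -1.2, 0.3])          # assumed weight vector
y = np.array([1.0, 2.0, 0.5])           # observed targets y_i
y_pred = np.array([1.05, 1.7, 0.5])     # model predictions f(x_i)
print(regularized_risk(w, y, y_pred, C=1.0, eps=0.1))   # 0.89 + 1.0*0.2 = 1.09

Only the second point lies outside the ε-tube (|2.0 − 1.7| = 0.3 > 0.1), so it alone contributes to the empirical-risk term.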
The ε-insensitive loss function gives the loss incurred by predicting f(x) instead of y. Now, we introduce two slack variables ξ_i and ξ_i* into the above regression estimation problem to transform it into an equivalent constrained optimization problem. The loss function and the slack variables allow the presence of noisy data; here, noisy data refers to those data points which lie outside the ε-tube. If the observed point is above the ε-tube, ξ_i is the positive difference between the observed value and ε, and if the observed point is below the ε-tube, ξ_i* is the negative difference between the observed value and ε. Hence, the constrained optimization problem formed amounts to minimizing the following equation:
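A sketch of this problem in its standard textbook form (the exact notation used here may differ slightly) is:

$$\min_{w,\,b,\,\xi,\,\xi^*} \ \frac{1}{2}\,\|w\|^2 + C \sum_{i=1}^{N} \big(\xi_i + \xi_i^*\big)$$

subject to

$$y_i - \langle w, x_i \rangle - b \le \varepsilon + \xi_i, \qquad \langle w, x_i \rangle + b - y_i \le \varepsilon + \xi_i^*, \qquad \xi_i,\ \xi_i^* \ge 0, \quad i = 1, \ldots, N.$$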
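In practice this formulation is available in standard libraries; the sketch below uses scikit-learn's SVR, in which the kernel performs the implicit nonlinear mapping into the feature space F, while C and epsilon play the roles of the trade-off constant and the tube width described above. The toy data and parameter values are assumptions chosen only for illustration.

import numpy as np
from sklearn.svm import SVR

# Toy one-dimensional regression problem (assumed data)
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(100, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(100)

# RBF kernel: implicit nonlinear mapping into the feature space F;
# C sets the error/complexity trade-off, epsilon the width of the tube.
model = SVR(kernel="rbf", C=1.0, epsilon=0.1)
model.fit(X, y)

print(model.predict(np.array([[0.5]])))   # prediction near sin(0.5)
print(len(model.support_))                # number of support vectors

Points predicted within the ε-tube do not become support vectors, so increasing epsilon typically yields a sparser model at the cost of accuracy.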