Machine Learning - Learning to Rank for Information Retrieval

Chapter 22

Machine Learning

In this chapter, we gather some key machine learning concepts that are related to this topic. This is not intended to be an introductory tutorial, and it is assumed that the reader already has some background in machine learning. We will first review some basic supervised learning problems, such as regression and classification, and then show how to use statistical learning theory to analyze their theoretical properties. In writing this chapter, we have drawn heavily on [1-5]. Note that we will not add explicit citations in the remainder of this chapter. Readers are strongly encouraged to read those materials, since this chapter is only a quick review of them.

In general, we use $x_i$ to denote the input variables, usually represented by features, and $y_i$ to denote the output or target variables that we are going to predict. A pair $(x_i, y_i)$ is called a training example, and the set of $n$ training examples $\{(x_i, y_i);\ i = 1, \ldots, n\}$ is called a training set. We use $\mathcal{X}$ to denote the space of input variables, and $\mathcal{Y}$ the space of output values.

In supervised learning, given a training set, the task is to learn a function $h : \mathcal{X} \to \mathcal{Y}$ such that $h(x)$ is a good predictor for the corresponding value of $y$. The function $h$ is called a hypothesis.
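To make the notation concrete, the following minimal Python sketch represents a training set as a list of (x_i, y_i) pairs and a hypothesis as an ordinary function from inputs to outputs. The names `Example`, `Hypothesis`, and `mean_absolute_error` are illustrative assumptions rather than anything defined in the text, and the mean absolute error is only one possible way to quantify what "a good predictor" means.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type aliases mirroring the notation in the text.
Example = Tuple[Sequence[float], float]          # one pair (x_i, y_i)
Hypothesis = Callable[[Sequence[float]], float]  # h : X -> Y


def mean_absolute_error(h: Hypothesis, training_set: List[Example]) -> float:
    """Average |h(x_i) - y_i| over the training set (an assumed quality measure)."""
    return sum(abs(h(x) - y) for x, y in training_set) / len(training_set)


# A toy training set with n = 3 examples and two features per input.
training_set: List[Example] = [
    ([1.0, 2.0], 3.0),
    ([2.0, 0.0], 2.0),
    ([0.0, 1.0], 1.0),
]


def h(x: Sequence[float]) -> float:
    """One candidate hypothesis: predict the sum of the two features."""
    return x[0] + x[1]


error = mean_absolute_error(h, training_set)
```

On this toy data the candidate hypothesis reproduces every target exactly, so its error is zero; a different hypothesis, such as one that always predicts 0, would score worse under the same measure.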

When the target variable that we are going to predict is continuous, the learning problem is called a regression problem. When $y$ can take on only a small number of discrete values (such as 0 or 1), it is called a classification problem.

22.1 Regression

22.1.1 Linear Regression

Here we take linear regression as an example to illustrate the regression problem. In linear regression, the hypothesis takes the following linear form:

$$h(x) = w^T x.$$
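The linear form above is simply the inner product of a parameter vector w and a feature vector x. A minimal sketch in plain Python, assuming both vectors are given as equal-length sequences of floats (the function name `linear_hypothesis` and the numeric values are illustrative, not from the text):

```python
from typing import Sequence


def linear_hypothesis(w: Sequence[float], x: Sequence[float]) -> float:
    """Compute h(x) = w^T x, the inner product of weights and features."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same dimension")
    return sum(w_j * x_j for w_j, x_j in zip(w, x))


w = [0.5, -1.0, 2.0]  # example parameter vector (arbitrary values)
x = [2.0, 1.0, 3.0]   # example feature vector (arbitrary values)

# 0.5*2.0 + (-1.0)*1.0 + 2.0*3.0 = 6.0
prediction = linear_hypothesis(w, x)
```

Learning in this setting amounts to choosing w from the training set; the sketch only evaluates the hypothesis for a given w.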
