A Gaussian process is completely defined by its second-order statistics. The mean
function and the covariance function of a Gaussian process can be defined as follows:

$$m(\mathbf{x}) = \mathbb{E}[f(\mathbf{x})], \qquad (2)$$

$$k(\mathbf{x}, \mathbf{x}') = \mathbb{E}\big[(f(\mathbf{x}) - m(\mathbf{x}))(f(\mathbf{x}') - m(\mathbf{x}'))\big], \qquad (3)$$

and $f(\mathbf{x}) \sim \mathcal{GP}\big(m(\mathbf{x}), k(\mathbf{x}, \mathbf{x}')\big)$. Without loss of generality, it is common to consider
GPs with mean function $m(\mathbf{x}) = 0$ to simplify the derivation. In this case, a GP is
fully specified given its covariance function.
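As a minimal sketch of this point, a zero-mean GP prior can be sampled once its covariance function is fixed: evaluating the covariance on a set of inputs gives a Gaussian whose draws are function values. The SE kernel and the single length-scale used here are illustrative choices, not specifics from the text.

```python
# Sketch: a zero-mean GP is fully specified by its covariance function,
# so sampling the prior at chosen inputs only requires the covariance
# matrix of those inputs. The SE kernel here is an illustrative choice.
import numpy as np

def se_cov(X, length_scale=1.0):
    """Covariance matrix K_ij = exp(-0.5 * (x_i - x_j)^2 / l^2) for 1-D inputs."""
    diff = X[:, None] - X[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

rng = np.random.default_rng(0)
X = np.linspace(0.0, 5.0, 50)
K = se_cov(X) + 1e-9 * np.eye(len(X))  # small jitter for numerical stability
# A draw from the GP prior at inputs X is a draw from N(0, K)
f = rng.multivariate_normal(np.zeros(len(X)), K)
```

Each call produces one random function evaluated at the input grid; the smoothness of the draws is controlled entirely by the covariance function.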
Since it is infeasible to consider all possible random functions, certain assumptions
must be made when making inference. By restricting the underlying function to be
distributed as a GP, the number of choices is reduced. Furthermore, a Gaussian
predictive distribution can be derived in closed form under such an assumption. If we
consider only zero-mean Gaussian processes, then for a test input $\mathbf{x}_*$, the mean and
variance of the predictive distribution can be computed as follows [16]:

$$\mu_* = \mathbf{k}_*^\top K^{-1} \mathbf{y}, \qquad (4)$$

$$\sigma_*^2 = k(\mathbf{x}_*, \mathbf{x}_*) - \mathbf{k}_*^\top K^{-1} \mathbf{k}_*, \qquad (5)$$

where $\mathbf{k}_* = [k(\mathbf{x}_*, \mathbf{x}_1), \ldots, k(\mathbf{x}_*, \mathbf{x}_n)]^\top$, $K$ is the covariance matrix of the training
input vectors with $K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$, and $\mathbf{y} = [y_1, \ldots, y_n]^\top$ is the vector of training
target values. The point prediction is commonly taken to be the mean of the predictive
distribution, i.e. $\hat{y}_* = \mu_*$.
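The predictive equations above can be sketched directly in code. This is a minimal illustration, assuming an SE kernel with per-dimension length-scales and a small jitter term added to $K$ for numerical stability; the function and variable names are my own, not from the text.

```python
# Sketch of zero-mean GP prediction, Eqs. (4)-(5), assuming an SE kernel.
import numpy as np

def se_kernel(x1, x2, length_scales):
    """Squared exponential kernel with per-dimension length-scales, Eq. (6)."""
    d = (x1 - x2) / length_scales
    return np.exp(-0.5 * np.dot(d, d))

def gp_predict(X_train, y_train, x_star, length_scales, jitter=1e-6):
    """Predictive mean and variance of a zero-mean GP at test input x_star."""
    n = X_train.shape[0]
    # Covariance matrix of training inputs, K_ij = k(x_i, x_j)
    K = np.array([[se_kernel(X_train[i], X_train[j], length_scales)
                   for j in range(n)] for i in range(n)])
    K += jitter * np.eye(n)  # small jitter keeps K well-conditioned
    # Vector of covariances between x_star and the training inputs
    k_star = np.array([se_kernel(x_star, X_train[i], length_scales)
                       for i in range(n)])
    mean = k_star @ np.linalg.solve(K, y_train)                 # Eq. (4)
    var = (se_kernel(x_star, x_star, length_scales)
           - k_star @ np.linalg.solve(K, k_star))               # Eq. (5)
    return mean, var
```

At a training input the predictive mean nearly reproduces the training target and the predictive variance collapses toward zero, as expected from Eqs. (4) and (5).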
4.2 Weight-Sharing Kernel
The covariance function is also called the kernel of a GP. As previously mentioned, the
behavior of a zero-mean GP is fully specified by its covariance function. For
regression problems, we want to assign similar prediction values to two input vectors that
are close in input space. In other words, if two similar time series are observed, the model
should give similar predictions. A widely used kernel possessing this property
is the radial basis function (RBF) kernel, which is called the squared exponential (SE) kernel
in the GP literature. It has the form

$$k(\mathbf{x}, \mathbf{x}') = \exp\left(-\frac{1}{2} \sum_{d=1}^{D} \frac{(x_d - x'_d)^2}{l_d^2}\right), \qquad (6)$$

where $x_d$ is the $d$-th dimension of vector $\mathbf{x}$ and $D$ is the dimension of the input vectors.
There are $D$ hyperparameters $l_1, \ldots, l_D$ for this kernel. These hyperparameters
are called the characteristic length-scales. Each length-scale serves as a distance measure
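A small sketch can illustrate how the length-scales act as per-dimension distance measures in Eq. (6): the larger $l_d$ is, the more a difference in dimension $d$ is discounted when comparing two inputs. The concrete inputs and length-scale values below are illustrative assumptions.

```python
# Illustration of the characteristic length-scales in Eq. (6): a large l_d
# discounts differences in dimension d, making that dimension matter less.
import numpy as np

def se_kernel(x1, x2, length_scales):
    """Squared exponential kernel with per-dimension length-scales, Eq. (6)."""
    d = (np.asarray(x1) - np.asarray(x2)) / np.asarray(length_scales)
    return float(np.exp(-0.5 * np.dot(d, d)))

a = [0.0, 0.0]
b = [0.0, 3.0]  # differs from a only in the second dimension

k_short = se_kernel(a, b, [1.0, 1.0])    # short l_2: inputs look dissimilar
k_long  = se_kernel(a, b, [1.0, 100.0])  # long l_2: the difference is discounted
```

With the short length-scale the covariance is close to zero, while with the long length-scale it stays close to one: the same pair of inputs is judged dissimilar or similar depending on the length-scale in the differing dimension.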