FIGURE 8.2: Illustration of dependencies of variables in the hierarchical model. The rating, $y$, for a document, $x$, is conditioned on the document and the user model, $w_m$, associated with the user $m$. Users share information about their models through the prior, $\Phi = (\mu, \Sigma)$.
The Bayesian hierarchical modeling approach has been widely used in real-world information retrieval applications. Generalized Bayesian hierarchical linear models, a simple set of Bayesian hierarchical models, are commonly used and have achieved good performance on collaborative filtering (67) and content-based adaptive filtering (76) (74) tasks. Figure 8.2 shows the graphical representation of a Bayesian hierarchical model. In this graph, each user model is represented by a random vector $w_m$. Assume a user model is sampled randomly from a prior distribution $P(w \mid \Phi)$. The system can predict the user label $y$ of a document $x$ given an estimate of $w_m$ (or $w_m$'s distribution) using a function $y = f(x, w)$. The model is called a generalized Bayesian hierarchical linear model when $y = f(w^T x)$ is any generalized linear model, such as logistic regression, SVM, or linear regression. To reliably estimate the user model $w_m$, the system can borrow information from other users through the prior $\Phi = (\mu, \Sigma)$.
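To make the prediction function $y = f(w^T x)$ concrete, here is a minimal sketch of two generalized linear choices of $f$: the identity (linear regression) and the logistic function (logistic regression). The function names and the example vectors are hypothetical and chosen only for illustration; they do not come from the cited systems.

```python
import math

def linear_score(w, x):
    """Inner product w^T x between a user model w and a document vector x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def predict_linear(w, x):
    """Linear regression: f is the identity, so the prediction is w^T x."""
    return linear_score(w, x)

def predict_logistic(w, x):
    """Logistic regression: f squashes w^T x into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-linear_score(w, x)))

# Hypothetical user model w_m and document feature vector x (illustration only).
w_m = [0.5, -1.0, 2.0]
x = [1.0, 0.0, 0.5]

rating = predict_linear(w_m, x)    # 0.5*1.0 + (-1.0)*0.0 + 2.0*0.5
prob = predict_logistic(w_m, x)    # probability that the user likes the document
```

Both predictors share the same linear score and differ only in the function applied to it, which is what places them in the same generalized linear family.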
Now we look at one commonly used model, $y = w^T x + \epsilon$, where $\epsilon \sim N(0, \sigma^2)$ is random noise (67) (76). Assume that each user model $w_m$ is an independent draw from a population distribution $P(w \mid \Phi)$, which is governed by some unknown hyperparameter $\Phi$. Let the prior distribution of user model $w$ be a Gaussian distribution with parameter $\Phi = (\mu, \Sigma)$, which is the commonly used prior for linear models. $\mu = (\mu_1, \mu_2, \ldots, \mu_K)$ is a $K$-dimensional vector that represents the mean of the Gaussian distribution, and $\Sigma$ is the covariance matrix of the Gaussian. Usually, a Normal distribution $N(0, aI)$ and an Inverse Wishart distribution $P(\Sigma) \propto |\Sigma|^{-\frac{1}{2}b} \exp\left(-\frac{1}{2} c \, \mathrm{tr}(\Sigma^{-1})\right)$ are used as hyperpriors to model the prior distributions of $\mu$ and $\Sigma$, respectively.
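The linear Gaussian model can be sketched in a few lines of NumPy. The sketch below makes a simplifying assumption that the text does not: the prior $\Phi = (\mu, \Sigma)$ and the noise level $\sigma$ are treated as known and fixed, whereas in the full model $\Phi$ is unknown and has its own hyperpriors. Under that assumption the posterior mean of $w_m$ has the standard closed form for Bayesian linear regression. All function names and numeric values are hypothetical.

```python
import numpy as np

def sample_user_models(mu, Sigma, num_users, rng):
    """Draw each user model w_m independently from the Gaussian prior N(mu, Sigma)."""
    return rng.multivariate_normal(mu, Sigma, size=num_users)

def sample_ratings(w, X, sigma, rng):
    """Generate y = w^T x + eps, eps ~ N(0, sigma^2), for each row x of X."""
    return X @ w + rng.normal(0.0, sigma, size=X.shape[0])

def posterior_mean(X, y, mu, Sigma, sigma):
    """Posterior mean of w_m given one user's data and a FIXED prior (assumption).

    Solves (Sigma^-1 + X^T X / sigma^2) w = Sigma^-1 mu + X^T y / sigma^2.
    """
    Sigma_inv = np.linalg.inv(Sigma)
    precision = Sigma_inv + X.T @ X / sigma**2
    rhs = Sigma_inv @ mu + X.T @ y / sigma**2
    return np.linalg.solve(precision, rhs)

rng = np.random.default_rng(0)
K = 3                                   # number of document features
mu = np.array([1.0, -0.5, 0.2])         # hypothetical prior mean
Sigma = 0.1 * np.eye(K)                 # hypothetical prior covariance
sigma = 0.5                             # hypothetical noise standard deviation

W = sample_user_models(mu, Sigma, num_users=4, rng=rng)
X = rng.normal(size=(2, K))             # user 0 has rated only two documents
y = sample_ratings(W[0], X, sigma, rng)

w_hat = posterior_mean(X, y, mu, Sigma, sigma)
# With no ratings at all, the estimate falls back to the shared prior mean.
w_no_data = posterior_mean(np.zeros((0, K)), np.zeros(0), mu, Sigma, sigma)
```

The last line shows the sense in which a user with little or no data borrows information from other users: the estimate is pulled toward the population mean $\mu$, and the data move it away from $\mu$ only as ratings accumulate.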