An RBM assigns an energy to every configuration of visible and hidden state vectors, denoted $\mathbf{v}$ and $\mathbf{h}$, respectively. For binary visible units, an RBM with $V$ visible units and $H$ hidden units is governed by the energy function
$$E(\mathbf{v},\mathbf{h}) = -\sum_{i=1}^{V}\sum_{j=1}^{H} v_i h_j w_{ij} - \sum_{i=1}^{V} v_i b_i - \sum_{j=1}^{H} h_j b_j, \tag{19.1}$$
where $v_i$ and $h_j$ are the binary states of visible unit $i$ and hidden unit $j$, $b_i$ and $b_j$ are their biases, and $w_{ij}$ is the weight between them.
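As a rough illustration of Eq. (19.1), the sketch below evaluates the energy of one joint configuration of a toy RBM in plain Python. The function name `rbm_energy`, the argument names `b_vis` and `b_hid` (standing for the visible biases $b_i$ and hidden biases $b_j$), and all numeric values are assumptions made for this example only.

```python
def rbm_energy(v, h, w, b_vis, b_hid):
    """Energy of a joint configuration (v, h) of a binary RBM, Eq. (19.1)."""
    # Pairwise term: sum over all visible/hidden pairs of v_i * h_j * w_ij.
    interaction = sum(v[i] * h[j] * w[i][j]
                      for i in range(len(v)) for j in range(len(h)))
    # Bias terms for the visible and hidden units.
    visible_bias = sum(v[i] * b_vis[i] for i in range(len(v)))
    hidden_bias = sum(h[j] * b_hid[j] for j in range(len(h)))
    return -interaction - visible_bias - hidden_bias


# Toy RBM with V = 2 visible and H = 2 hidden units (illustrative values).
w = [[0.5, -1.0],
     [2.0, 0.0]]      # w[i][j] connects visible unit i to hidden unit j
b_vis = [0.1, -0.2]   # visible biases b_i
b_hid = [0.3, 0.0]    # hidden biases b_j

print(rbm_energy([1, 0], [1, 1], w, b_vis, b_hid))
```

Because every term carries a minus sign, a configuration has low energy when active units have positive biases and are joined by positive weights.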
Under this energy function, the conditional probabilities for each visible and
hidden unit given the others are
$$p(h_j = 1 \mid \mathbf{v}) = g\Big(b_j + \sum_{i} v_i w_{ij}\Big) \tag{19.2}$$

$$p(v_i = 1 \mid \mathbf{h}) = g\Big(b_i + \sum_{j} h_j w_{ij}\Big) \tag{19.3}$$
where
$$g(x) = \frac{1}{1 + e^{-x}} \tag{19.4}$$
is the logistic or sigmoid function.
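The conditionals in Eqs. (19.2) and (19.3) can be sketched in a few lines of plain Python. The function names, the bias argument names `b_vis` and `b_hid`, the toy parameter values, and the sampling helper are assumptions introduced for this example; the sampling step is only there to show how the per-unit probabilities might be used.

```python
import math
import random


def sigmoid(x):
    """Logistic function g(x) = 1 / (1 + exp(-x)), Eq. (19.4)."""
    # Fine for moderate x; very large negative x would overflow math.exp.
    return 1.0 / (1.0 + math.exp(-x))


def p_hidden_given_visible(v, w, b_hid):
    """p(h_j = 1 | v) for every hidden unit j, Eq. (19.2)."""
    return [sigmoid(b_hid[j] + sum(v[i] * w[i][j] for i in range(len(v))))
            for j in range(len(b_hid))]


def p_visible_given_hidden(h, w, b_vis):
    """p(v_i = 1 | h) for every visible unit i, Eq. (19.3)."""
    return [sigmoid(b_vis[i] + sum(h[j] * w[i][j] for j in range(len(h))))
            for i in range(len(b_vis))]


def sample_binary(probs, rng):
    """Draw independent 0/1 states with the given 'on' probabilities."""
    return [1 if rng.random() < p else 0 for p in probs]


# Toy parameters (illustrative values): w[i][j] links visible i to hidden j.
w = [[0.5, -1.0],
     [2.0, 0.0]]
b_vis = [0.1, -0.2]
b_hid = [0.3, 0.0]

rng = random.Random(0)                      # fixed seed for repeatability
v = [1, 0]
p_h = p_hidden_given_visible(v, w, b_hid)   # one probability per hidden unit
h = sample_binary(p_h, rng)                 # sampled hidden states
p_v = p_visible_given_hidden(h, w, b_vis)   # one probability per visible unit
print(p_h, h, p_v)
```

Each conditional depends only on the units of the other layer, so all units within one layer can be evaluated (and sampled) independently once the other layer is fixed.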
The network assigns a probability to every possible joint configuration $(\mathbf{v},\mathbf{h})$ via the energy function as
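$$p(\mathbf{v},\mathbf{h}) = \frac{e^{-E(\mathbf{v},\mathbf{h})}}{Z} = \frac{e^{-E(\mathbf{v},\mathbf{h})}}{\sum_{\mathbf{u},\mathbf{g}} e^{-E(\mathbf{u},\mathbf{g})}}, \tag{19.5}$$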
where $Z$ is called the partition function. The marginal distribution of the visible units is then given as
$$p(\mathbf{v}) = \sum_{\mathbf{h}} p(\mathbf{v},\mathbf{h}) \tag{19.6}$$
and the gradient of the average log-likelihood is
$$\frac{\partial \log p(\mathbf{v})}{\partial w_{ij}} = \langle v_i h_j \rangle_0 - \langle v_i h_j \rangle_\infty. \tag{19.7}$$
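For a very small RBM, Eqs. (19.5) to (19.7) can be checked by brute-force enumeration. The sketch below assumes the usual reading of the angle brackets, which this passage does not spell out: $\langle\cdot\rangle_0$ is the expectation with the visible units clamped to the data, and $\langle\cdot\rangle_\infty$ is the expectation under the model distribution $p(\mathbf{v},\mathbf{h})$. It uses a single training vector, so the average log-likelihood is just $\log p(\mathbf{v})$. All function names and parameter values are hypothetical, and the enumeration costs $2^{V+H}$ terms, so it only works for toy sizes.

```python
import itertools
import math


def energy(v, h, w, b_vis, b_hid):
    """E(v, h) from Eq. (19.1)."""
    inter = sum(v[i] * h[j] * w[i][j]
                for i in range(len(v)) for j in range(len(h)))
    return (-inter
            - sum(vi * bi for vi, bi in zip(v, b_vis))
            - sum(hj * bj for hj, bj in zip(h, b_hid)))


def states(n):
    """All 2**n binary vectors of length n."""
    return list(itertools.product([0, 1], repeat=n))


def partition_function(w, b_vis, b_hid):
    """Z: sum of exp(-E(u, g)) over all joint configurations, Eq. (19.5)."""
    return sum(math.exp(-energy(u, g, w, b_vis, b_hid))
               for u in states(len(b_vis)) for g in states(len(b_hid)))


def log_p_visible(v, w, b_vis, b_hid):
    """log p(v), with p(v) = sum over h of p(v, h), Eq. (19.6)."""
    unnorm = sum(math.exp(-energy(v, h, w, b_vis, b_hid))
                 for h in states(len(b_hid)))
    return math.log(unnorm) - math.log(partition_function(w, b_vis, b_hid))


def gradient_wij(v, i, j, w, b_vis, b_hid):
    """<v_i h_j>_0 - <v_i h_j>_inf for one training vector v, Eq. (19.7)."""
    # Data term: v is clamped, so the expectation is v_i * p(h_j = 1 | v).
    act = b_hid[j] + sum(v[k] * w[k][j] for k in range(len(v)))
    data_term = v[i] / (1.0 + math.exp(-act))
    # Model term: expectation of v_i * h_j under the joint p(v, h).
    z = partition_function(w, b_vis, b_hid)
    model_term = sum(u[i] * g[j] * math.exp(-energy(u, g, w, b_vis, b_hid))
                     for u in states(len(b_vis))
                     for g in states(len(b_hid))) / z
    return data_term - model_term


# Toy parameters (illustrative values only).
w = [[0.5, -1.0], [2.0, 0.0]]
b_vis = [0.1, -0.2]
b_hid = [0.3, 0.0]
v = (1, 0)
i, j, eps = 0, 1, 1e-5

# Central finite difference of log p(v) with respect to w[i][j].
w_plus = [row[:] for row in w]
w_plus[i][j] += eps
w_minus = [row[:] for row in w]
w_minus[i][j] -= eps
numeric = (log_p_visible(v, w_plus, b_vis, b_hid)
           - log_p_visible(v, w_minus, b_vis, b_hid)) / (2 * eps)
analytic = gradient_wij(v, i, j, w, b_vis, b_hid)
print(analytic, numeric)
```

The two printed values should agree closely. In realistic models the model term cannot be enumerated like this, which is why it is normally approximated by sampling rather than computed exactly.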