(1) There is no missing data in sample D, or D is complete;
(2) Parameter vectors are mutually independent, viz.

$$p(\theta_s \mid S^h) = \prod_{i=1}^{n} \prod_{j=1}^{q_i} p(\theta_{ij} \mid S^h).$$

This is called parameter independence.
Under the above assumptions, for a given random sample D the parameters remain independent:

$$p(\theta_s \mid D, S^h) = \prod_{i=1}^{n} \prod_{j=1}^{q_i} p(\theta_{ij} \mid D, S^h) \qquad (6.37)$$
Then we can update each parameter vector $\theta_{ij}$ independently. Suppose each parameter $\theta_{ij}$ has a Dirichlet prior $\mathrm{Dir}(\theta_{ij} \mid \alpha_{ij1}, \alpha_{ij2}, \ldots, \alpha_{ijr_i})$; we then get the posterior distribution:

$$p(\theta_{ij} \mid D, S^h) = \mathrm{Dir}(\theta_{ij} \mid \alpha_{ij1} + N_{ij1}, \alpha_{ij2} + N_{ij2}, \ldots, \alpha_{ijr_i} + N_{ijr_i}) \qquad (6.38)$$

where $N_{ijk}$ is the number of cases in D that satisfy $X_i = x_i^k$ and $Pa_i = pa_i^j$. Now we can make the prediction of interest by taking the mean over the possible $\theta_s$.
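As a concrete illustration, the posterior update of Eq. (6.38) amounts to adding the observed counts $N_{ijk}$ to the prior hyperparameters. A minimal sketch in Python, for one variable $X_i$ under one parent configuration $pa_i^j$ (the function name and the toy numbers are our own, not from the text):

```python
import numpy as np

def dirichlet_posterior(alpha, counts):
    """Posterior hyperparameters of a Dirichlet prior after observing
    multinomial counts: alpha_ijk + N_ijk, as in Eq. (6.38)."""
    return np.asarray(alpha, dtype=float) + np.asarray(counts, dtype=float)

# Toy example: X_i has r_i = 3 states, one parent configuration pa_i^j.
alpha_ij = [1.0, 1.0, 1.0]   # uniform Dirichlet prior
N_ij = [5, 2, 3]             # counts N_ijk observed in the sample D
post = dirichlet_posterior(alpha_ij, N_ij)
print(post)                  # [6. 3. 4.]
```

The update is purely count-based, which is why complete data makes this step cheap: no iterative optimization is needed.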
For example, for the $(N+1)$-th case,

$$p(x_{N+1} \mid D, S^h) = E_{p(\theta_s \mid D, S^h)}\left(\prod_{i=1}^{n} \theta_{ijk}\right).$$

According to the parameter independence given D, we can calculate the expectation:

$$p(x_{N+1} \mid D, S^h) = \int \prod_{i=1}^{n} \theta_{ijk} \cdot p(\theta_s \mid D, S^h)\, d\theta_s = \prod_{i=1}^{n} \int \theta_{ijk} \cdot p(\theta_{ij} \mid D, S^h)\, d\theta_{ij}$$
and finally get:

$$p(x_{N+1} \mid D, S^h) = \prod_{i=1}^{n} \frac{\alpha_{ijk} + N_{ijk}}{\alpha_{ij} + N_{ij}} \qquad (6.39)$$

where $\alpha_{ij} = \sum_{k=1}^{r_i} \alpha_{ijk}$ and $N_{ij} = \sum_{k=1}^{r_i} N_{ijk}$. Because the unrestricted multinomial distribution belongs to the exponential family, the above computation is straightforward.
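Each factor of Eq. (6.39) is the mean of the corresponding posterior Dirichlet, i.e. a simple ratio of updated counts. A minimal sketch for one factor, with hypothetical hyperparameters and counts (the full prediction multiplies such factors over all $i$):

```python
import numpy as np

def predictive_prob(alpha_ij, N_ij, k):
    """One factor of Eq. (6.39): p(X_i = x_i^k | D, S^h) for a fixed
    parent configuration j, i.e. (alpha_ijk + N_ijk) / (alpha_ij + N_ij),
    where alpha_ij and N_ij sum the hyperparameters and counts
    over the r_i states of X_i."""
    alpha_ij = np.asarray(alpha_ij, dtype=float)
    N_ij = np.asarray(N_ij, dtype=float)
    return (alpha_ij[k] + N_ij[k]) / (alpha_ij.sum() + N_ij.sum())

# Hypothetical numbers: uniform prior over 3 states, counts (5, 2, 3).
p = predictive_prob([1.0, 1.0, 1.0], [5, 2, 3], k=0)
print(p)  # (1 + 5) / (3 + 10) = 6/13, about 0.4615
```

This is exactly the closed-form posterior mean that the exponential-family remark refers to: no integral has to be computed numerically.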
A Bayesian network over variables X represents the joint distribution of X. So, whether a Bayesian network is constructed from prior knowledge, from data, or from a combination of the two, in principle it can be used to infer any probability of interest. However, exact, or even approximately exact, inference in a Bayesian network with discrete variables is NP-hard. Current solutions simplify the computation by exploiting conditional independence, construct a simple network topology for a specific reasoning problem, or simplify the network structure at the cost of some loss of precision. Even so, constructing a Bayesian network often requires considerable computation. For some problems, such as naïve Bayes classification, exploiting conditional independence can greatly reduce computation without losing much precision.
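The naïve Bayes case illustrates the point: assuming every attribute is conditionally independent given the class, the joint distribution factorizes into one small table per attribute, so inference is a product of lookups. A minimal sketch under that assumption (the class priors and conditional tables below are illustrative numbers only):

```python
import numpy as np

def naive_bayes_predict(prior, cond, x):
    """Posterior over classes: p(C = c | x) is proportional to
    p(C = c) * prod_i p(X_i = x_i | C = c), using the
    conditional-independence factorization of naive Bayes."""
    scores = np.asarray(prior, dtype=float).copy()
    for i, xi in enumerate(x):
        scores *= cond[i][:, xi]   # p(X_i = x_i | C) for each class
    return scores / scores.sum()   # normalize to a posterior

# Two classes, two binary attributes (hypothetical numbers).
prior = [0.6, 0.4]
cond = [np.array([[0.8, 0.2],     # p(X_1 | C = 0)
                  [0.3, 0.7]]),   # p(X_1 | C = 1)
        np.array([[0.5, 0.5],
                  [0.9, 0.1]])]
print(naive_bayes_predict(prior, cond, x=[0, 1]))
```

With n binary attributes, this needs only 2n conditional tables instead of a joint table of size $2^n$, which is the computational saving the text alludes to.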