Database Reference
In-Depth Information
Quoting from [6.5]: “In general, two variables X and Y may be conditionally
dependent given a set Z while independent on the subset or superset of Z.”
In the worst case, like in SGS, every possible combination of the conditioning
set should be examined, which would require an exponential number of tests.
Second, results from CI tests may not be reliable for high-order CI tests (when
the size of the conditioning set is high) [6.5], [6.15]. Hence, for algorithms that
require high-order CI tests, the results may be inaccurate. Third, because a
network is constructed in a step-by-step manner, the construction algorithm
may be unstable in the sense that an earlier mistake during construction is
consequential [6.5], [6.16]. Moreover, this suggests that the order of testing
the CI relations is important, which will be a concern when one pursues
optimal performance.
The Search-and-Scoring Approach. The second approach is called the
search-and-scoring approach. Recall that a Bayesian network encodes a joint
probability distribution (Eq. 6.2); we can derive a measure to assess the good-
ness of such encoding. For instance, such a measure could be derived from
Bayesian statistics, information theory, or the minimum description length
(MDL) principle [6.17]. Though their theoretical foundations are different,
some studies [6.18], [6.19] show that different metrics are asymptotically
equivalent under certain conditions.
BecauseweemploytheMDLmetric[6.20]inourwork,wetakeitas
an example for illustration. Basically, the metric is derived from information
theory and incorporates the idea of the minimum description length principle.
The metric has two components: the network description length and the
data description length. An optimal network is the one that minimizes both
simultaneously.
Formally, let U =
be the set of discrete variables, Π N i be
the parent set of a node N i in the candidate network, and v i be the number
of possible states of the variable N i . The network description length is given
by
{
N 1 ,...,N n }
n
1)
N j ∈Π N i
|
,
Π N i |
log
(n)+d(v i
v j
2
i=1
where d is a constant denoting the number of bits used to store a numerical
value. Intuitively, the network description length represents the structural
complexity of the network, which is evaluated by the number of bits required
to encode the graphical structure and store the conditional probability table
at each node.
The data description length is given by
n
M(Π N i )
M(N i N i ) ,
M(N i N i )log 2
i=1
N i N i
Search WWH ::




Custom Search