of Y. Each variable is conditionally independent of its nondescendants in the graph, given its parents.
Figure 9.1 is a simple belief network, adapted from Russell, Binder, Koller, and Kanazawa [RBKK95] for six Boolean variables. The arcs in Figure 9.1(a) allow a representation of causal knowledge. For example, having lung cancer is influenced by a person's family history of lung cancer, as well as whether or not the person is a smoker. Note that the variable PositiveXRay is independent of whether the patient has a family history of lung cancer or is a smoker, given that we know the patient has lung cancer. In other words, once we know the outcome of the variable LungCancer, then the variables FamilyHistory and Smoker do not provide any additional information regarding PositiveXRay. The arcs also show that the variable LungCancer is conditionally independent of Emphysema, given its parents, FamilyHistory and Smoker.
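As a rough illustrative sketch (not taken from the book), the structure of Figure 9.1(a) can be recorded as a mapping from each variable to its parents. The parent lists below follow the relationships described in this passage; the parents of Dyspnea are not stated in this excerpt and are an assumption.

# Structure of Figure 9.1(a) as parent lists (a sketch, not the book's code).
# LungCancer and Emphysema share the parents FamilyHistory and Smoker, and
# PositiveXRay depends only on LungCancer, as described above.
PARENTS = {
    "FamilyHistory": [],
    "Smoker": [],
    "LungCancer": ["FamilyHistory", "Smoker"],
    "Emphysema": ["FamilyHistory", "Smoker"],
    "PositiveXRay": ["LungCancer"],
    "Dyspnea": ["LungCancer", "Emphysema"],  # assumed; not stated in this excerpt
}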
A belief network has one conditional probability table (CPT) for each variable.
The CPT for a variable Y specifies the conditional distribution P(Y | Parents(Y)), where Parents(Y) are the parents of Y. Figure 9.1(b) shows a CPT for the variable LungCancer. The conditional probability for each known value of LungCancer is given for each possible combination of the values of its parents. For instance, from the upper leftmost and bottom rightmost entries, respectively, we see that

    P(LungCancer = yes | FamilyHistory = yes, Smoker = yes) = 0.8
    P(LungCancer = no | FamilyHistory = no, Smoker = no) = 0.9.
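For illustration, the LungCancer CPT could be stored as a dictionary keyed by the values of its parents, (FamilyHistory, Smoker). Only the two entries quoted above are filled in; their complements follow because each column of the CPT sums to 1, and the remaining parent combinations would be read off Figure 9.1(b).

# CPT for LungCancer, keyed by (FamilyHistory, Smoker) -- a sketch holding
# only the two entries quoted in the text; Figure 9.1(b) supplies the rest.
CPT_LUNG_CANCER = {
    ("yes", "yes"): {"yes": 0.8, "no": 0.2},   # 0.8 from the text; 0.2 = 1 - 0.8
    ("no", "no"):   {"yes": 0.1, "no": 0.9},   # 0.9 from the text; 0.1 = 1 - 0.9
}

# P(LungCancer = yes | FamilyHistory = yes, Smoker = yes)
print(CPT_LUNG_CANCER[("yes", "yes")]["yes"])  # 0.8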
Let X = (x_1, ..., x_n) be a data tuple described by the variables or attributes Y_1, ..., Y_n, respectively. Recall that each variable is conditionally independent of its nondescendants in the network graph, given its parents. This allows the network to provide a complete representation of the existing joint probability distribution with the following equation:

    P(x_1, ..., x_n) = ∏_{i=1}^{n} P(x_i | Parents(Y_i)),        (9.1)

where P(x_1, ..., x_n) is the probability of a particular combination of values of X, and the values for P(x_i | Parents(Y_i)) correspond to the entries in the CPT for Y_i.
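As a minimal sketch of Equation (9.1), reusing the PARENTS mapping from the earlier listing, the joint probability of one complete assignment is the product of one CPT entry per variable. Apart from the two LungCancer values quoted earlier (0.8, and 1 - 0.9 = 0.1), every number below is an illustrative placeholder rather than a value from Figure 9.1.

# P(variable = "yes" | parent values), keyed by a tuple of parent values in
# the order given by PARENTS.  All numbers except the two quoted LungCancer
# entries are placeholders for illustration only.
CPT = {
    "FamilyHistory": {(): 0.1},
    "Smoker":        {(): 0.3},
    "LungCancer":    {("yes", "yes"): 0.8, ("yes", "no"): 0.5,
                      ("no", "yes"): 0.6,  ("no", "no"): 0.1},
    "Emphysema":     {("yes", "yes"): 0.5, ("yes", "no"): 0.3,
                      ("no", "yes"): 0.4,  ("no", "no"): 0.05},
    "PositiveXRay":  {("yes",): 0.9, ("no",): 0.2},
    "Dyspnea":       {("yes", "yes"): 0.9, ("yes", "no"): 0.7,
                      ("no", "yes"): 0.6,  ("no", "no"): 0.1},
}

def joint(assignment):
    """Equation (9.1): P(x1, ..., xn) = prod_i P(xi | Parents(Yi))."""
    p = 1.0
    for var, parents in PARENTS.items():
        p_yes = CPT[var][tuple(assignment[q] for q in parents)]
        p *= p_yes if assignment[var] == "yes" else 1.0 - p_yes
    return p

# Joint probability of one complete assignment of the six variables.
x = {"FamilyHistory": "yes", "Smoker": "yes", "LungCancer": "yes",
     "Emphysema": "no", "PositiveXRay": "yes", "Dyspnea": "yes"}
print(joint(x))  # about 0.00756 with the placeholder numbers above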
A node within the network can be selected as an “output” node, representing a class label attribute. There may be more than one output node. Various algorithms for inference and learning can be applied to the network. Rather than returning a single class label, the classification process can return a probability distribution that gives the probability of each class. Belief networks can be used to answer probability of evidence queries (e.g., what is the probability that an individual will have LungCancer, given that they have both PositiveXRay and Dyspnea) and most probable explanation queries (e.g., which group of the population is most likely to have both PositiveXRay and Dyspnea).
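Continuing the same sketch (it reuses PARENTS, CPT, and joint from the previous listing, with the same placeholder probabilities), both kinds of query can be answered for a network this small by brute-force enumeration over the joint distribution of Equation (9.1): sum over the unobserved variables for a probability-of-evidence query, or maximize over them for a most probable explanation.

from itertools import product

def prob_evidence(evidence):
    """P(evidence): sum the joint of Equation (9.1) over every way of
    completing the variables that the evidence leaves unspecified."""
    hidden = [v for v in PARENTS if v not in evidence]
    return sum(joint({**evidence, **dict(zip(hidden, vals))})
               for vals in product(["yes", "no"], repeat=len(hidden)))

# Probability-of-evidence query:
# P(LungCancer = yes | PositiveXRay = yes, Dyspnea = yes)
evidence = {"PositiveXRay": "yes", "Dyspnea": "yes"}
posterior = (prob_evidence({**evidence, "LungCancer": "yes"})
             / prob_evidence(evidence))

# Most probable explanation: the completion of the evidence whose joint
# probability is highest.
hidden = [v for v in PARENTS if v not in evidence]
mpe = max((dict(evidence, **dict(zip(hidden, vals)))
           for vals in product(["yes", "no"], repeat=len(hidden))),
          key=joint)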
Belief networks have been used to model a number of well-known problems. One
example is genetic linkage analysis (e.g., the mapping of genes onto a chromosome). By
casting the gene linkage problem in terms of inference on Bayesian networks, and using