Database Reference
In-Depth Information
Algorithm 1
Iterative classification algorithm
Iterative Classification Algorithm (ICA)
for
each node
Y
i
∈Y
do
{
bootstrapping
}
{
c
}
ompute label using only observed nodes in
N
i
compute
a
i
using only
X∩N
i
y
i
←
f
(
a
i
)
repeat
{
iterative classification
}
generate ordering
O
over nodes in
Y
for
each node
Y
i
∈O
do
{
ompute new estimate of
y
i
compute
a
i
using current assignments to
c
}
N
i
f
(
a
i
)
until
all class labels have stabilized or a threshold number of iterations
have elapsed
y
i
←
3.3.1 Iterative Classification
The basic premise behind ICA is extremely simple. Consider a node
Y
i
∈Y
whose value we need to determine and suppose we know the values of all the
other nodes in its neighborhood
N
i
can contain both observed
and unobserved variables). Then, ICA assumes that we are given a local
classifier
f
that takes the values of
N
i
(note that
N
i
as arguments and returns a label value
for
Y
i
from the class label set
. For local classifiers
f
that do not return
a class label but a goodness/likelihood value given a set of attribute values
and a label, we simply choose the label that corresponds to the maximum
goodness/likelihood value; in other words, we replace
f
with argmax
l∈L
L
f
.
This makes the local classifier
f
an extremely flexible function and we can use
anything ranging from a decision tree to an SVM in its place. Unfortunately,
it is rare in practice that we know all values in
N
i
which is why we need
to repeat the process iteratively, in each iteration, labeling each
Y
i
using the
current best estimates of
N
i
and the local classifier
f
, and continuing to do
so until the assignments to the labels stabilize.
Most local classifiers are defined as functions whose argument consists of
one fixed-length vector of attribute values. A common approach to circumvent
such a situation is to use an aggregation operator such as
count
,
mode
,or
prop
,
which measures the proportion of neighbors with a given label.
Algorithm 1 depicts the ICA algorithm as pseudo-code where we use
a
i
to
denote the vector encoding the values in
N
i
obtained after aggregation. Note
that in the first ICA iteration, all labels
y
i
are undefined and to initialize
them we simply apply the local classifier to the observed attributes in the
neighborhood of
Y
i
; this is referred to as “bootstrapping” in Algorithm 1.
Search WWH ::
Custom Search