3.3.2 Gibbs Sampling
Gibbs sampling (GS) (12) is widely regarded as one of the most accurate
approximate inference procedures. It was originally proposed in (10) in the
context of image restoration. Unfortunately, it is also very slow, and a common
issue when implementing GS is determining when the procedure has converged.
Even though there are tests that can help one determine
convergence, they are usually too expensive or complicated to implement.
Researchers in collective classification (29; 22; 24) have developed a version
of Gibbs sampling that is easy to implement and faster than traditional GS.
The basic idea behind this algorithm is to assume, just like in the case of
ICA, that we have access to a local classifier f that can sample for the best
label estimate for Y_i given all the values for the nodes in N_i. We keep doing
this repeatedly for a fixed number of iterations (a period known as “burn-in”).
After that, not only do we sample labels for each Y_i ∈ Y, but we also maintain
count statistics of how many times we sampled label l for node Y_i. After
collecting a predefined number of such samples, we output the best label
assignment for node Y_i by choosing the label that was assigned to Y_i the
maximum number of times while collecting samples. The pseudo-code for GS is
shown in Algorithm 2. For all the experiments that we report later, we set
burn-in to 200 iterations and collected 800 samples.
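To make the procedure concrete, the following is a minimal sketch of this Gibbs
sampling variant, not the reference implementation of Algorithm 2. It assumes
that observed and currently sampled labels live in one dictionary and that the
local classifier f exposes a hypothetical predict_proba(attributes,
neighbor_labels) interface returning a probability for each label; both of
these names are assumptions made for the example.

    import random
    from collections import Counter, defaultdict

    def gibbs_sampling(nodes, neighbors, features, local_classifier, labels,
                       burn_in=200, num_samples=800, seed=0):
        # nodes            : ids of the nodes with unobserved labels
        # neighbors        : dict mapping a node id to the ids of its neighbors (N_i)
        # features         : dict mapping a node id to its local attribute vector
        # local_classifier : hypothetical object whose predict_proba(attrs, nbr_labels)
        #                    returns a dict {label: probability}
        # labels           : dict of current label assignments; observed labels stay
        #                    fixed because only the nodes in `nodes` are resampled
        rng = random.Random(seed)
        counts = defaultdict(Counter)

        def sample_label(n):
            # Sample a label for Y_i from the local classifier's distribution,
            # conditioned on the current labels of the nodes in N_i.
            probs = local_classifier.predict_proba(
                features[n], [labels[m] for m in neighbors[n]])
            candidates = list(probs)
            return rng.choices(candidates,
                               weights=[probs[c] for c in candidates], k=1)[0]

        # Burn-in: resample every unobserved node repeatedly, discarding the draws.
        for _ in range(burn_in):
            for n in nodes:
                labels[n] = sample_label(n)

        # Sampling phase: record how often each label is drawn for each node.
        for _ in range(num_samples):
            for n in nodes:
                labels[n] = sample_label(n)
                counts[n][labels[n]] += 1

        # Output the label that was sampled most often for each node.
        return {n: counts[n].most_common(1)[0][0] for n in nodes}

With burn_in=200 and num_samples=800, this matches the settings reported for
the experiments mentioned above.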
3.3.3 Local Classifiers and Further Optimizations
One of the benefits of both ICA and GS is the fact that it is fairly simple to
make use of any local classifier. Classifiers that have been used include naïve
Bayes (7; 28), logistic regression (21), decision trees (14), and weighted-vote
(22). There is some evidence to indicate that discriminatively trained
local classifiers such as logistic regression tend to produce higher accuracies
than others; this is consistent with results in other areas.
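As an illustration of how easily an off-the-shelf learner can play the role of
the local classifier, the sketch below concatenates a node's own attributes
with counts of each class among its neighbors' current labels and trains a
logistic regression model with scikit-learn. The helper name
relational_features and the data layout are assumptions made for the example,
not details of the cited systems.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def relational_features(attrs, neighbor_labels, label_set):
        # One simple aggregation: append to the node's own attribute vector the
        # count of each class among its neighbors' current labels. Proportions or
        # the neighborhood mode ("weighted-vote") are common alternatives.
        counts = [sum(1 for l in neighbor_labels if l == c) for c in label_set]
        return np.concatenate([np.asarray(attrs, dtype=float),
                               np.asarray(counts, dtype=float)])

    # Training on fully labeled nodes (attrs, nbrs, labels, train_nodes and
    # label_set are assumed to be available in the layout used above):
    # X = np.vstack([relational_features(attrs[n], [labels[m] for m in nbrs[n]],
    #                                    label_set) for n in train_nodes])
    # y = np.array([labels[n] for n in train_nodes])
    # f = LogisticRegression(max_iter=1000).fit(X, y)
    # f.predict_proba(...) can then serve as the local classifier inside ICA or GS.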
Other aspects of ICA that have been investigated include the ordering strategy,
i.e., the order in which nodes are visited for relabeling in each ICA iteration.
There is some evidence to suggest that ICA is fairly robust to a number of
simple ordering strategies, such as random ordering, visiting nodes in ascending
order of the diversity of their neighborhood class labels, and relabeling nodes
in descending order of label confidence (11). However, there
is also some evidence that certain modifications to the basic ICA procedure
tend to produce improved classification accuracies. For instance, both (28)
and (24) propose a strategy in which only a subset of the unobserved variables
is used as input for feature construction. More specifically, in each iteration,
they choose the top-k most confident predicted labels and use only those
unobserved variables in the following iteration's predictions, thus ignoring
the less confident predicted labels. In each subsequent iteration they increase
the value of k so that in the last iteration all nodes are used for prediction.
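A rough sketch of this "cautious" schedule is given below, reusing the
hypothetical predict_proba(attributes, neighbor_labels) interface from the
Gibbs sampling sketch. The linear growth of k across iterations is one
reasonable choice for illustration, not necessarily the schedule used in (28)
or (24).

    def cautious_ica(nodes, neighbors, features, local_classifier,
                     observed_labels, num_iterations=10):
        # ICA variant that, in each iteration, commits only the k most confident
        # predicted labels and feeds them into the next iteration's features;
        # k grows linearly so that the final iteration uses every node.
        committed = {}
        for it in range(1, num_iterations + 1):
            predictions = {}
            for n in nodes:
                # Only observed labels and committed predictions contribute to
                # the relational features; less confident labels are ignored.
                nbr_labels = [observed_labels.get(m, committed.get(m))
                              for m in neighbors[n]]
                nbr_labels = [l for l in nbr_labels if l is not None]
                probs = local_classifier.predict_proba(features[n], nbr_labels)
                best = max(probs, key=probs.get)
                predictions[n] = (best, probs[best])
            # Keep only the top-k most confident predictions for the next round.
            k = max(1, (it * len(nodes)) // num_iterations)
            ranked = sorted(predictions.items(),
                            key=lambda kv: kv[1][1], reverse=True)
            committed = {n: lab for n, (lab, _) in ranked[:k]}
        return committed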