TABLE 3.3: Accuracy results for the CiteSeer dataset. CC
algorithms significantly outperformed their CO counterparts except for
ICA-NB and GS-NB for matched cross-validation. CO and CC algorithms
based on LR outperformed the NB versions, but the differences were not
significant. ICA-NB outperformed GS-NB significantly for SS, but the
rest of the differences between LR versions of ICA and GS, LBP and MF
were not significant.
Algorithm   SS       RS       M
CO-NB       0.7427   0.7487   0.7646
ICA-NB      0.7540   0.7683   0.7752
GS-NB       0.7596   0.7680   0.7737
CO-LR       0.7334   0.7321   0.7532
ICA-LR      0.7629   0.7732   0.7812
GS-LR       0.7574   0.7699   0.7843
LBP         0.7663   0.7759   0.7843
MF          0.7657   0.7732   0.7888
used and we did not have to tune the initializations for these two algorithms.
They were the easiest to train and test among all the collective classification
algorithms evaluated.
Third, ICA and GS produced very similar results for almost all experiments.
However, ICA is a much faster algorithm than GS. In our largest dataset,
CiteSeer, for example, ICA-NB took 14 minutes to run while GS-NB took
over 3 hours. The difference is large because ICA converges in just
a few iterations, whereas GS has to go through significantly more iterations
per run: an initial burn-in stage (200 iterations), followed by a large
number of further iterations to collect a sufficiently large sample (800
iterations).
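
The difference in iteration counts can be illustrated with a minimal Python sketch. It is not the implementation used in the experiments: the toy graph, the function names, and the local classifier (a smoothed vote over neighbor labels, standing in for NB or LR) are all assumptions made for illustration. Only the burn-in length (200) and the number of sampling iterations (800) are taken from the text above.

```python
import random
from collections import Counter

LABELS = (0, 1)

# Hypothetical toy graph: two triangles joined by the edge c-d.
GRAPH = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
OBSERVED = {"a": 0, "b": 0, "e": 1, "f": 1}  # c and d are unlabeled


def local_dist(node, graph, labels):
    """Assumed local classifier: add-one smoothed vote over labeled neighbors."""
    counts = Counter(labels[n] for n in graph[node] if labels.get(n) is not None)
    weights = [1 + counts[label] for label in LABELS]
    total = sum(weights)
    return [w / total for w in weights]


def argmax_label(dist):
    # Ties resolve to the first label in LABELS.
    return max(LABELS, key=lambda label: dist[label])


def bootstrap(graph, observed):
    """Initial labels for unlabeled nodes, using observed neighbors only."""
    labels = dict(observed)
    unknown = [n for n in graph if n not in observed]
    for n in unknown:
        labels[n] = argmax_label(local_dist(n, graph, observed))
    return labels, unknown


def ica(graph, observed, max_iters=100):
    """Iterative classification: re-assign the most likely label until stable."""
    labels, unknown = bootstrap(graph, observed)
    for it in range(1, max_iters + 1):
        changed = False
        for n in unknown:
            new = argmax_label(local_dist(n, graph, labels))
            if new != labels[n]:
                labels[n] = new
                changed = True
        if not changed:
            return labels, it  # converged after `it` sweeps
    return labels, max_iters


def gibbs(graph, observed, burn_in=200, num_samples=800, seed=0):
    """Gibbs sampling: sample labels, discard burn-in, keep the modal label."""
    rng = random.Random(seed)
    labels, unknown = bootstrap(graph, observed)
    counts = {n: Counter() for n in unknown}
    for it in range(burn_in + num_samples):
        for n in unknown:
            dist = local_dist(n, graph, labels)
            labels[n] = rng.choices(LABELS, weights=dist)[0]
            if it >= burn_in:  # only post-burn-in samples are counted
                counts[n][labels[n]] += 1
    final = dict(observed)
    for n in unknown:
        final[n] = counts[n].most_common(1)[0][0]
    return final, burn_in + num_samples


ica_labels, ica_sweeps = ica(GRAPH, OBSERVED)
gs_labels, gs_sweeps = gibbs(GRAPH, OBSERVED)
print("ICA sweeps:", ica_sweeps, "GS sweeps:", gs_sweeps)
```

On this toy graph ICA stops as soon as a sweep changes no label, while the sampler always performs all 1,000 sweeps regardless of how quickly the labels settle. That fixed cost is the source of the running-time gap in the sketch.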
3.7 Related Work
Even though collective classification has gained attention only in the past
five to seven years, the general problem of inference for structured data has
received attention for a considerably longer period of time from various
research communities including computer vision, spatial statistics and natural
language processing. In this section, we attempt to describe some of the work
that is most closely related to the work described in this article; however,
due to the widespread interest in collective classification our list is sure to be
incomplete.
One of the earliest principled approximate inference algorithms, relaxation
labeling (13), was developed by researchers in computer vision in the context of
 