Web page with the learnt classifiers. Yet acquiring labeled training data is
often costly and tedious. Web page clustering, which groups documents
according to some similarity metric, can help to improve retrieval. The
problem is that the solution search of traditional clustering methods is somewhat
blind and lacks semantic meaning, so the clustering results are often
unsatisfactory. In this section, we proposed a semi-supervised learning algorithm.
Under the framework of the Bayesian latent semantic model, the new algorithm uses
no labeled training data, only a few latent class/theme variables, to assign
documents to their corresponding classes/themes. The algorithm has two stages. In
the first stage, it applies Bayesian latent semantic analysis to label the
documents that contain latent theme variable(s); in the second stage, it uses a
naïve Bayesian model to label the documents without latent theme variables, drawing
on the knowledge in the documents labeled in the first stage. Experimental results
demonstrate that the algorithm achieves high precision and recall. We will further
investigate related issues, such as the influence of latent variable selection on
the clustering result and how to implement word clustering under the framework of
Bayesian latent semantic analysis.
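The two-stage idea summarized above can be illustrated with a minimal sketch. It is not the algorithm from this section: stage 1 here replaces Bayesian latent semantic analysis with plain seed-word matching, and stage 2 is an ordinary multinomial naïve Bayes classifier with Laplace smoothing. All names (`stage1_seed_labels`, `train_naive_bayes`, `two_stage_label`, the seed words, and the toy documents) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def stage1_seed_labels(docs, seed_words):
    """Stage 1 (simplified): label each document that contains at least one
    seed word with the theme whose seed words it matches most often.
    The section uses Bayesian latent semantic analysis here instead."""
    labels = {}
    for i, doc in enumerate(docs):
        tokens = set(tokenize(doc))
        hits = {theme: len(tokens & words) for theme, words in seed_words.items()}
        best = max(hits, key=hits.get)
        if hits[best] > 0:
            labels[i] = best
    return labels


def train_naive_bayes(docs, labels):
    """Collect class counts and per-class word counts from labeled documents."""
    class_docs = Counter(labels.values())
    word_counts = defaultdict(Counter)
    vocab = set()
    for i, theme in labels.items():
        tokens = tokenize(docs[i])
        word_counts[theme].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def classify(doc, model):
    """Return the theme with the highest log posterior (Laplace smoothing).
    Words never seen in the labeled documents are ignored."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_theme, best_score = None, -math.inf
    for theme, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[theme].values())
        for tok in tokenize(doc):
            if tok in vocab:
                score += math.log(
                    (word_counts[theme][tok] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_theme, best_score = theme, score
    return best_theme


def two_stage_label(docs, seed_words):
    """Stage 1 labels documents with seed words; stage 2 labels the rest."""
    labels = stage1_seed_labels(docs, seed_words)
    model = train_naive_bayes(docs, labels)
    for i, doc in enumerate(docs):
        if i not in labels:
            labels[i] = classify(doc, model)
    return labels


if __name__ == "__main__":
    toy_docs = [
        "football match score goal",
        "stock market shares price",
        "the team won the match with a late goal",
        "investors watched the market price fall",
    ]
    toy_seeds = {"sports": {"football"}, "finance": {"stock"}}
    print(two_stage_label(toy_docs, toy_seeds))
```

In this toy run, the first two documents are labeled in stage 1 because they contain a seed word; the last two contain none and are labeled in stage 2 from the word statistics of the first two. The quality of the final labels therefore depends heavily on which seed (latent theme) words are chosen, which is the same sensitivity the section lists as an open issue.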
Exercises
1. Please explain conditional probability, prior probability and posterior
probability.
2. Please describe Bayes' formula and explain its significance in detail.
3. Please describe some criteria for prior distribution selection.
4. What does 'Naïve' mean in Naïve Bayesian classification? Please briefly
state the main ideas for improving Naïve Bayesian classification.
5. Please describe the structure of a Bayesian network and its construction, and
give an example of the use of a Bayesian network.
6. What is semi-supervised text mining? Please describe some applications of
the Bayesian model in Web page clustering.
7. In recent years, with the development of Internet technology, Bayes' rule
has been widely applied. Please give two concrete applications of
Bayes' rule and explain the results.