Using Bayesian methods to learn both the structure and the probabilities of a
Bayesian network from prior information and sample information, and thereby to
construct a complete Bayesian network, opens an avenue for applying Bayesian
networks to data mining and knowledge discovery. Compared with other data mining
methods, such as rule-based methods, decision trees and artificial neural
networks, a Bayesian network has the following characteristics:
(1) It can integrate prior and posterior information. This avoids the subjective
bias of relying on prior information alone, the extensive blind search and
computation needed when samples are scarce, and the influence of noise when
only posterior information is used. As long as the prior is determined properly,
learning can be effective, especially when samples are hard or costly to obtain.
(2) It can handle incomplete data sets.
(3) It can explore causal relations in data.
(4) Mature and effective algorithms are available. Although probabilistic
inference is NP-hard for arbitrary Bayesian networks, in many practical problems
these operations can either be simplified by adding constraints or be solved by
approximation methods.
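Characteristic (1) can be illustrated with a minimal sketch. It assumes a single binary parameter (for example, one entry of a conditional probability table) with a Beta prior, which is a common conjugate choice and not necessarily the method used in this text. The function names `posterior_mean` and `sample_only_estimate` are hypothetical. The sketch shows the prior tempering the estimate when the sample is small and the data dominating when the sample is large.

```python
from fractions import Fraction


def posterior_mean(prior_a, prior_b, successes, failures):
    """Posterior mean of a Bernoulli parameter under a Beta(prior_a, prior_b) prior.

    Assumption: the prior is expressed as pseudo-counts, so updating
    amounts to adding the observed counts to the prior counts.
    """
    a = prior_a + successes
    b = prior_b + failures
    return Fraction(a, a + b)


def sample_only_estimate(successes, failures):
    """Relative frequency: the estimate that ignores prior information."""
    return Fraction(successes, successes + failures)


# Small sample: three trials, all successes.
# The sample-only estimate is pushed to the extreme; the prior tempers it.
small_sample_only = sample_only_estimate(3, 0)
small_with_prior = posterior_mean(2, 2, 3, 0)

# Large sample: the data dominate and both estimates nearly agree.
large_sample_only = sample_only_estimate(300, 100)
large_with_prior = posterior_mean(2, 2, 300, 100)

print(small_sample_only, small_with_prior)
print(float(large_sample_only), float(large_with_prior))
```

The pseudo-counts `prior_a` and `prior_b` are illustrative values only. Choosing them well is the prior determination problem that the text describes as difficult when many variables are involved.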
However, the computational cost of a Bayesian network is high, so it may be
less efficient than other methods when a problem can also be solved by other
efficient approaches. Although there are some methods for determining the prior,
which is extremely important when samples are hard to obtain, finding a
reasonable prior over many variables remains a hard problem in practice. Besides,
a Bayesian network requires many assumptions as preconditions, and there are no
ready rules for judging whether a practical problem satisfies them.
These problems deserve further study. Still, it can be predicted that in data
mining and knowledge discovery, especially for data with probabilistic and
statistical features, the Bayesian network will become a powerful tool.
6.6 Bayesian Latent Semantic Model
With the spread of the Internet, the amount of Web information is growing
exponentially. Two questions have become a research focus of Web information
processing: how to organize the information reasonably, so that the expected
target can be found in massive Web data, and how to analyze the information
effectively, so that new and potentially useful patterns can be mined from it.
The classification of Web information is an effective approach to improving
search effectiveness and efficiency. For example, when searching with a Web
search engine, if the class information of the query is available, the search
scope will be limited and the