Database Reference
In-Depth Information
where M(.) is the count of the particular instantiation in the data set. In
essence, the data description length evaluates the proximity of the distribu-
tions implied by the data and the candidate network, which is a measure of
the accuracy of the candidate network.
Because the MDL metric is simply the sum of the two description lengths,
it puts a balance between model complexity and model accuracy. In other
words, the optimal network, with regard to the metric, should be simple and
accurately represent the joint distribution.
As a property common to other metrics, the MDL metric is node-
decomposable and could be written as in Eq. (6.5). One can observe that
the score is simply the sum of the independent evaluation on the parent set,
Π N i ,ofeverynodeN i in the domain U:
MDL(G)=
N i ∈U
MDL(N i N i ).
(6.5)
With the defined metric, the network learning problem can be formulated
as a search problem. The objective is to search for the network structure
that has the optimal score. However, the problem is di cult as the search
space, which contains all possible network structures, is huge. Chickering et al.
proved that the search problem is NP-hard with the use of a particular met-
ric [6.21]. Some research, therefore, resorts to greedy search heuristics [6.22],
[6.20]. However, the drawback of these approaches is that suboptimal solu-
tions may be obtained. Some others use systematic and exhaustive search,
like branch-and-bound [6.23], to find the optimal solution. In the worst case,
the time consumed would be considerable. Recently, some researchers at-
tempt [6.24], [6.9] to use evolutionary computation to tackle the problem.
6.2.2 Evolutionary Computation
Evolutionary computation is a general stochastic search methodology. The
principal idea is borrowed from evolution mechanisms proposed by Charles
Darwin. Evolutionary computation is becoming popular as it often gives sat-
isfactory results for various optimization problems in different areas. For ex-
ample, it is applied in data mining, image processing, pattern recognition,
and signal processing [6.25], [6.26], [6.27], [6.28], [6.29], [6.30].
In essence, evolutionary computation is a group search algorithm with
guidance. In Fig. 6.2, we show the typical steps during searching. A can-
didate solution in the search space is called a chromosome. A chromosome
consists of a number of genes, which correspond to the elements constituting
a solution. At the beginning, a pool of chromosomes, also called the popula-
tion, is created randomly. As such, a number of search points are maintained.
For each generated individual, the fitness value, which stands for the quality
of the candidate solution it encodes, is computed according to a predefined
fitness function. In subsequent iterations, or generations, new chromosomes
Search WWH ::




Custom Search