We estimate the performance of the learning algorithms using six measures:
- the average MDL score of the final solutions, the smaller the better (AFS);
- the average MDL score of the best network obtained in the first generation (AIS);
- the average execution time in seconds (AET);
- the average generation at which the best-so-far solution is obtained (ANG);
- the average number of MDL metric evaluations in a run (AME);
- the average structural difference, i.e., the number of edges added, omitted, and reversed between the final solution and the original network (ASD); a sketch of this computation follows the list.
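The last measure can be obtained by comparing the two directed edge sets directly. The sketch below is our own illustration and not the authors' code; the function name and the (parent, child) edge representation are assumptions, and a reversed edge is counted once rather than as one omission plus one addition.

```python
# Illustrative sketch (not the authors' code) of the structural-difference
# measure (ASD): edges are (parent, child) tuples, and a reversed edge is
# counted once rather than as one omission plus one addition.
def structural_difference(learned_edges, original_edges):
    learned, original = set(learned_edges), set(original_edges)
    reversed_edges = {(a, b) for (a, b) in learned - original if (b, a) in original}
    added = (learned - original) - reversed_edges
    omitted = {(a, b) for (a, b) in original - learned if (b, a) not in learned}
    return len(added) + len(omitted) + len(reversed_edges)

# Example: one edge reversed and one edge omitted -> difference of 2.
print(structural_difference([("A", "B"), ("C", "B")],
                            [("B", "A"), ("C", "B"), ("C", "D")]))
```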
Recall that each algorithm is executed 40 times on each data set; the figures are therefore averages over 40 trials. Without any fine-tuning, we adopt the following parameter values as the default settings (see the configuration sketch below):
For MDLEP, we adopt the same parameter settings that appear in the
original publication [6.9]: the population size is 50, the tournament size is
seven, and the maximum number of generations is 5000.
For CCGA, the cutoff value for the CI test phase is 0.3. For each species
population in the search phase, the population size is 20, with the crossover
and mutation rates set to 0.7 and 0.2, respectively. The belief factor is 0.2.
We use 1000 generations as the termination criterion.
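Purely as an illustration, these default settings could be collected as follows; the key names are our own assumptions and do not come from the MDLEP or CCGA implementations.

```python
# Hypothetical summary of the default settings listed above; the key names
# are assumptions, not identifiers from the original implementations.
MDLEP_DEFAULTS = {
    "population_size": 50,
    "tournament_size": 7,
    "max_generations": 5000,
}

CCGA_DEFAULTS = {
    "ci_test_cutoff": 0.3,          # cutoff value for the CI test phase
    "species_population_size": 20,  # per species population in the search phase
    "crossover_rate": 0.7,
    "mutation_rate": 0.2,
    "belief_factor": 0.2,
    "max_generations": 1000,        # termination criterion
}
```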
6.5.2 Comparing CCGA with MDLEP
We provide a summary of the results in Table 6.2. In the table, the MDL
score of the original network is shown under the name of the data set for
reference. Besides the averaged measures, we include the standard deviations
of the respective measures, which appear in parentheses.
Except for the ASIA-1000 data set, we observe that CCGA often finds a
network that is as good as, or better than, the one found by MDLEP. For two of
the six cases, the difference is statistically significant at the 0.05 level
according to the Mann-Whitney test.11 Using the MDL score of the original network as
a reference, we observe that although MDLEP performs well (competitive
with CCGA) for smaller data sets, it clearly needs a longer running time to
compete with our approach for larger data sets. We regard the ALARM-O data set
as a harder problem instance: the data set is relatively large, and both
algorithms fail to approximate the score of the original network.
However, CCGA still achieves better performance. For the PRINTD-5000 data
set, both algorithms recover the original network structure, and hence
the two have identical performance. For the ASIA-1000 data set, we find that
MDLEP outperforms CCGA in terms of the final score. On closer inspection,
we find that this is because an important edge in the network has been
cut away during the CI test phase. Consequently, CCGA is stuck at a local optimum.
11 The Mann-Whitney test is a nonparametric test that suits our need as the final
score is observed not to follow a normal distribution.
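For readers who want to reproduce this significance test, a minimal sketch using SciPy's mannwhitneyu is given below; the score arrays are random placeholders standing in for the 40 final MDL scores collected per algorithm, not the actual experimental results.

```python
# Minimal sketch of the significance test on final MDL scores; the arrays
# below are placeholder values, not the scores reported in Table 6.2.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
ccga_scores = rng.normal(loc=1000.0, scale=10.0, size=40)   # 40 CCGA runs (placeholder)
mdlep_scores = rng.normal(loc=1010.0, scale=12.0, size=40)  # 40 MDLEP runs (placeholder)

stat, p_value = mannwhitneyu(ccga_scores, mdlep_scores, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.4f}")  # significant if p < 0.05
```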