likelihood providing a good compromise between convergence speed and computational load. The final convergence results were consistent for all the supervision ratios, and the log-likelihood values were ordered by sr. In addition, we verified that the distances (measured by MSE) between the estimated centroids $\hat{\mathbf{b}}_k$, $k = 1, \ldots, 3$, and the original centroids of the ICA mixtures decrease with higher supervision.
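As a simple illustration of this check, a minimal sketch of the centroid-distance computation is given below, assuming the estimated and original centroids are stored as arrays with classes already matched; the function name and shapes are ours, not part of the Mixca implementation.

```python
import numpy as np

def centroid_mse(estimated, original):
    """Mean squared error between estimated and original class centroids.

    estimated, original: arrays of shape (K, n_dims), one centroid per class,
    with the class ordering of the estimates already matched to the originals.
    """
    return float(np.mean(np.sum((estimated - original) ** 2, axis=1)))

# Hypothetical usage: one set of estimates per supervision ratio
# mse_by_sr = {sr: centroid_mse(estimates[sr], true_centroids)
#              for sr in (0, 0.1, 0.3, 0.5, 0.7, 1)}
```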
The above results demonstrate that the perturbation introduced by $r_k(i)$, $k = 1, \ldots, K$, due to unlabelled data affects the convergence properties in the learning of the class parameters. This residual increment affects the cases with the lowest supervision the most. For the highest supervision ratios, the convergence depends on the algorithm used to update the ICA parameters of the classes, as discussed in Sect. 3.3.5.
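To make the origin of this perturbation concrete, the sketch below shows how class responsibilities could be computed in a generic semi-supervised mixture update: labelled observations receive hard 0/1 responsibilities, whereas unlabelled ones receive Bayes posteriors, which is where the residual term enters. This is an illustrative reconstruction under our own naming and shape assumptions, not the exact Mixca update.

```python
import numpy as np

def responsibilities(log_px_given_k, log_priors, labels):
    """Posterior class responsibilities r_k(i) for a semi-supervised mixture.

    log_px_given_k: (N, K) class-conditional log-densities log p(x_i | C_k).
    log_priors:     (K,) log mixing proportions.
    labels:         (N,) class index for labelled observations, -1 if unlabelled.
    """
    log_joint = log_px_given_k + log_priors                       # (N, K)
    log_joint = log_joint - log_joint.max(axis=1, keepdims=True)  # numerical stability
    r = np.exp(log_joint)
    r /= r.sum(axis=1, keepdims=True)          # Bayes posteriors for unlabelled data
    labelled = labels >= 0
    r[labelled] = 0.0                          # labelled data: hard 0/1 responsibilities
    r[labelled, labels[labelled]] = 1.0
    return r
```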
The classification and BSS results for sr ≥ 0.3 achieved the correct solution, and the results for sr < 0.3 were close to the correct solution. The maximum difference across supervision ratios was between the unsupervised case and the fully supervised case (0.176 in log-likelihood, 8.6 dB in SIR, and 28.3% in classification accuracy).
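For reference, one simple way to compute a per-source SIR in dB is sketched below: each estimated source is projected onto its true counterpart (resolving the ICA scale ambiguity) and the residual is treated as interference. This is our own simplified metric; the exact SIR definition behind the reported figures may differ.

```python
import numpy as np

def sir_db(true_source, estimate):
    """Signal-to-interference ratio (dB) of one recovered source.

    Assumes the permutation ambiguity has been resolved, i.e. the estimate
    has already been matched to its true source.
    """
    s = true_source - true_source.mean()
    y = estimate - estimate.mean()
    target = (y @ s) / (s @ s) * s        # component of y aligned with s
    interference = y - target
    return 10.0 * np.log10(np.sum(target ** 2) / np.sum(interference ** 2))
```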
Differences of this size can be critical depending on the specific application, and they underscore the importance of incorporating semi-supervised learning in ICAMM in order to take advantage of a partial labelling of the data (see Sect. 3.4.5). In addition, we repeated this experiment changing the embedded ICA algorithm used for parameter updating to standard algorithms such as JADE and FastICA. In general, the SIR and classification accuracy results were comparable for all the embedded algorithms. The efficiency of JADE and FastICA in the separation of super-Gaussian sources is well known; for this kind of source, the kernel density estimation obtained similar results. However, in terms of log-likelihood, the non-parametric Mixca converged to the highest values over a range of sr (0.3-1), whereas Mixca-JADE and Mixca-FastICA only converged to the highest log-likelihood values in the supervised case. Thus, these latter algorithms produced more cases of intermediate log-likelihood values.
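As a point of reference for the embedded algorithms, the standalone snippet below separates super-Gaussian (Laplacian) sources with scikit-learn's FastICA. It is not the Mixca embedding itself, and JADE is omitted because it has no standard scikit-learn implementation; the data sizes and seeds are arbitrary.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
S = rng.laplace(size=(1000, 4))     # super-Gaussian (Laplacian) sources
A = rng.normal(size=(4, 4))         # random square mixing matrix
X = S @ A.T                         # observed linear mixtures

ica = FastICA(n_components=4, random_state=0)
S_hat = ica.fit_transform(X)        # recovered sources, up to permutation and scale
```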
In the second experiment, we measured the approximate number of observation vectors required by the Mixca procedure to achieve particular mean SIRs. A total of 400 Monte Carlo simulations were generated with the following parameters: (i) Number of classes in the ICA mixture K = 2; (ii) Number of observation vectors per class N = 100, 200, 300, 400, 500; (iii) Number of sources = 4 (Laplacians with a sharp peak at the bias and heavy tails); (iv) Supervision ratio (sr) = 0, 0.1, 0.3, 0.5, 0.7, 1; (v) Embedded ICA algorithm = Non-parametric Mixca.
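A rough sketch of how such synthetic ICA-mixture data could be generated is given below; the mixing matrices, centroid scale, and labelling scheme are illustrative assumptions rather than the exact simulation settings.

```python
import numpy as np

def make_ica_mixture(n_per_class=200, n_sources=4, K=2, sr=0.3, seed=0):
    """Synthetic ICA-mixture data loosely matching the experiment setup.

    Each class has its own random mixing matrix and bias (centroid); sources
    are Laplacian (sharp peak at the bias, heavy tails). A fraction sr of each
    class is labelled with its class index; the rest is marked -1 (unlabelled).
    """
    rng = np.random.default_rng(seed)
    X, labels = [], []
    for k in range(K):
        S = rng.laplace(size=(n_per_class, n_sources))
        A = rng.normal(size=(n_sources, n_sources))
        b = rng.normal(scale=5.0, size=n_sources)        # class centroid (bias)
        X.append(S @ A.T + b)
        y = np.full(n_per_class, -1)
        idx = rng.choice(n_per_class, int(round(sr * n_per_class)), replace=False)
        y[idx] = k
        labels.append(y)
    return np.vstack(X), np.concatenate(labels)
```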
Figure 3.6a, b shows the detailed and mean results of the second experiment. Figure 3.6b depicts the SIR curves obtained for different numbers of observation vectors, with one curve per supervision ratio used in the training stage. The number of observation vectors required to obtain a particular SIR value increased as supervision decreased. In general, the results demonstrate that the non-parametric Mixca procedure is able to achieve a good SIR with a small number of observation vectors; e.g., 20 dB of SIR was obtained with only 203, 215, 231, 258, and 306 observation vectors for sr = 1, 0.7, 0.5, 0.3, and 0.1, respectively. The results of this experiment confirm that the convergence efficiency of the proposed procedure increases significantly when only a small