Biology Reference
In-Depth Information
making no assumptions about the densities, and using the same formulas
in all cases.
(b) In each EM iteration the density functions must be evaluated, requiring
KN function evaluations (in Step 1) where K is the number of densities
in the mixture. In comparison, the PDQ iterations are cheaper, requiring
no function evaluations.
(c) Because the EM iterations are costly, it is common to use another
method, e.g., the K -means method, as a pre-processor, to get closer
to the centers before starting EM. The PDQ Algorithm needs no such
switch, and works well from a cold start.
(d) If correct assumptions are made regarding the mixing distributions, then
the EM method has an advantage over the PDQ method, as illustrated in
Example 2.6 below.
(e) While the numerical comparison between the two algorithms should best
be left to others, our preliminary tests show the two algorithms to be
roughly equivalent in terms of the returned results, with the PDQ Algo-
rithm somewhat faster.
2.5. Numerical Experiments
In Examples 2.4-2.6 below the PDQ and EM Algorithms were applied to the
same data, in order to compare their performance.
The results are reported in
Tables 2.1-2.4.
These examples are typical representatives of many numerical
tests we did.
Both programs used here were written in MATLAB, the EM code by Tsui [16],
and the PDQ code by the first author.
The comparison is subject to the following limitations:
(a) The EM program code uses the K -means method (Hartigan [5]) as a
preprocessor to get a good start. The number of iterations and running
time reported for this program in Table 2.4 is just for the EM part, not
including the time for the K -means part.
(b) Our PDQ code is the first un-optimized, un-finessed version, a verbatim
implementation of Algorithm 1.
(c) The number of iterations depends on the stopping rule. In the PDQ Al-
gorithm, the stopping rule is Step 4 of Algorithm 1, and the number of
iterations will increase the smaller is . In the EM Algorithm the stop-
ping rule does involve also the convergence of the likelihood function,
and the effect of the tolerance is less pronounced.
Search WWH ::




Custom Search