Biomedical Engineering Reference
In-Depth Information
( a )
Mean relative subproblem size
100
Rec−DCM3
DCM3
DCM2
90
80
70
60
50
40
30
20
10
0
1
2
3
4
5
6
7
8
9
10
Dataset#
( b )
Number of subproblems
15
Rec−DCM3
DCM3
DCM2
10
5
0
1
2
3
4
5
6
7
8
9
10
Dataset#
Figure 16.7 Comparison of DCM2, DCM3, and Recursive-DCM3 decompositions. DCM2
decompositions on datasets 5 to 10 could not be computed due to memory limitations.
16.4.1.4 Subtree Construction and Assembly Once the data set is decom-
posed into overlapping subsets A 1 , A 2 , ... , A m (for us, m
4 is typical), subtrees are
constructed for each subset, A i , using the chosen “base method,” and then combined
using the Strict Consensus Merger [31, 32] to produce a tree on the combined data
set. The proof that the resulting tree is accurate (i.e., agrees, with high probability
and in the limit, with the unknown underlying “true tree”) is similar to the argument
shown in Ref. [31].
Our Rec-I-DCM3 algorithm takes as input the set S
of n aligned
biomolecular sequences, the chosen base method, and a starting tree T . In our experi-
ments, we have usedTNT (with default settings) as our basemethod, as it is the hardest
to improve (in comparison, the PAUP* implementation of the parsimony ratchet [36]
is easier to improve). Our algorithm produces smaller subproblems by recursively
applying the centroid-edge decomposition until each subproblem is of size at most
k . The subtrees are then computed, merged, and resolved (from the bottom-up, using
={
s 1 , ... , s n }
 
Search WWH ::




Custom Search