The second part measures how well the leaves X outside S fit with respect to
all of the leaves in S. This is

$$E^2(X) = \sum_{A \in S} \frac{(T_{AR} + T_{RX} - d_{AX})^2}{\sigma_{AX}^2}.$$
For each X, we will assume that we are free to choose its optimal distance
to R. (If this is not the case, it is the “fault” of some other part of
the tree, not the “fault” of S.) $T_{RX}$ is then set so that $E^2(X)$ is minimal
(which is equivalent to setting $T_{RX}$ by LS for each S and each X).
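Because $E^2(X)$ is quadratic in $T_{RX}$, the minimizing value has a closed form: a weighted mean of the differences $d_{AX} - T_{AR}$ with weights $1/\sigma_{AX}^2$. The following is a minimal sketch of that step, not the text's own implementation. The function names `optimal_t_rx` and `e2_of_x` and the list-based inputs are hypothetical, and the sketch assumes the minimization is unconstrained (it does not force $T_{RX}$ to be non-negative).

```python
def optimal_t_rx(t_ar, d_ax, var_ax):
    """Weighted least-squares value of T_RX for one outside leaf X.

    t_ar[i]   : tree distance T_AR from leaf A_i in S to the subtree root R
    d_ax[i]   : measured distance d_AX between A_i and X
    var_ax[i] : variance sigma^2_AX of that measurement
    (All three argument names are illustrative assumptions.)
    """
    weights = [1.0 / v for v in var_ax]
    # Setting dE^2/dT_RX = 0 gives a weighted mean of (d_AX - T_AR).
    numerator = sum(w * (d - t) for w, d, t in zip(weights, d_ax, t_ar))
    return numerator / sum(weights)


def e2_of_x(t_ar, d_ax, var_ax, t_rx):
    """E^2(X): sum over A in S of (T_AR + T_RX - d_AX)^2 / sigma^2_AX."""
    return sum((t + t_rx - d) ** 2 / v
               for t, d, v in zip(t_ar, d_ax, var_ax))
```

Calling `optimal_t_rx` first and passing its result to `e2_of_x` gives the minimal contribution of X for a given subtree S.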
Combining the two parts (thereby considering all leaves X outside S), we
get as an intermediate result the total sum of the weighted squared
errors:

$$E'_S = E_1 + \sum_{X \notin S} E^2(X).$$
The number of degrees of freedom in the computation of $E'_S$ is

$$p = \underbrace{\binom{k}{2} + k(n-k)}_{\#\ \text{distance relations}} - \underbrace{\bigl[(2k - 3 + 1) + (n-k)\bigr]}_{\#\ \text{branches to be set}},$$
where $k = |S|$ is the number of leaves in the subtree S. Clearly, $k > 1$ for
this to make sense. Normalizing $E'_S$ by the degrees of freedom, we obtain
the final index
$$E_S = \frac{E'_S}{p} = \frac{2\left(E_1 + \sum_{X \notin S} E^2(X)\right)}{(2n - k - 4)(k - 1)}.$$
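The degrees-of-freedom count and the normalization can be checked numerically. The sketch below counts distance relations and free branch lengths separately, subtracts them, and divides the combined error by the result. It assumes that n is the total number of leaves in the tree (so that n - k leaves lie outside S) and that the first-part error $E_1$ and the per-leaf errors $E^2(X)$ have already been computed; the function names `degrees_of_freedom` and `subtree_index` are hypothetical.

```python
from math import comb


def degrees_of_freedom(n, k):
    """p = (# distance relations) - (# branches to be set).

    n : total number of leaves (assumed meaning of n)
    k : number of leaves in the subtree S
    """
    # Pairs inside S, plus pairs (A in S, X outside S).
    distance_relations = comb(k, 2) + k * (n - k)
    # Branches inside S including the one to R, plus one T_RX per outside leaf.
    branches_to_set = (2 * k - 3 + 1) + (n - k)
    return distance_relations - branches_to_set


def subtree_index(e1, e2_values, n, k):
    """Normalized index E_S = (E_1 + sum of E^2(X)) / p."""
    p = degrees_of_freedom(n, k)
    if k <= 1 or p <= 0:
        raise ValueError("index is undefined: no degrees of freedom left")
    return (e1 + sum(e2_values)) / p
```

Expanding the count gives $2p = (2n - k - 4)(k - 1)$, which is the denominator of the final index; for k = 1 it is zero, matching the requirement k > 1.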
For each subtree, we evaluate $E_S$ and choose the one with the smallest
$E_S$. This is our best-fitting subtree, and we will now assume that it can
 