Biology Reference
In-Depth Information
functions for sequence management and analysis, statistics, numerics,
graphics, parallel execution, etc. The closed-source kernel is written in C,
including time-critical library functions (such as LST). The library
functions that are programmed in the interpreted language are open
source(i) under the MIT License.
LST is a CHE (constructor-heuristic-evaluator) procedure; that is, it
constructs an initial tree and applies heuristics until the evaluator shows
that the tree cannot be improved any further. The input to LST is a distance
matrix and a variance matrix, and the evaluator is the Euclidean norm of the
weighted errors between the input distances and the distances in the actual
tree. Thus, the function it minimizes is
$$\sum_{ij} \frac{(T_{ij} - d_{ij})^2}{\sigma_{ij}^2}.$$
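As a rough illustration of the evaluator, the following Python sketch computes the weighted sum of squared errors between tree distances and input distances. The function name `weighted_sq_error` and the nested-list matrix layout are hypothetical and not part of the library described here. The sketch also assumes that each unordered pair is counted once and that a pair with infinite variance is skipped, so it contributes nothing to the sum.

```python
import math

def weighted_sq_error(tree_dist, dist, var):
    """Sum of (T_ij - d_ij)^2 / sigma_ij^2 over all pairs i < j.

    tree_dist, dist and var are symmetric n x n nested lists
    (hypothetical layout, chosen only for this illustration).
    """
    n = len(dist)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if math.isinf(var[i][j]):
                # Assumption: infinite variance marks a missing distance,
                # so the pair gets zero weight.
                continue
            total += (tree_dist[i][j] - dist[i][j]) ** 2 / var[i][j]
    return total

inf = math.inf
T = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]        # distances measured in the tree
d = [[0, 2, 3], [2, 0, 6], [3, 6, 0]]        # input distances
v = [[0, 1, 2], [1, 0, inf], [2, inf, 0]]    # input variances; pair (1, 2) is missing
print(weighted_sq_error(T, d, v))            # 1/1 + 1/2 = 1.5
```

A tree that reproduces every known input distance exactly would score 0 under this measure.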
A missing distance between two objects A and B can be indicated
with $\sigma_{AB}^2 = \infty$. LST uses a randomized version of WPGMA (weighted
pair group method using arithmetic averages). WPGMA is like UPGMA,
except that the computation of the distances to a joined subtree is done
taking the variances into account. The new variances also have to be
computed.
[Figure: subtrees A and B joined by branches of lengths l_A and l_B; other labels: S, X, R.]
When joining subtrees/nodes A and B, we use the following formulas:
compute $l_A$ and $l_B$ from the equations
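$$l_A + l_B = d_{AB}$$
$$l_A - l_B = \frac{\displaystyle\sum_{X \in S} \frac{d_{AX} - d_{BX}}{\sigma_{AX}^2 + \sigma_{BX}^2}}{\displaystyle\sum_{X \in S} \frac{1}{\sigma_{AX}^2 + \sigma_{BX}^2}}$$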
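The two equations for the branch lengths form a small linear system: the sum l_A + l_B is fixed by d_AB, and the difference l_A - l_B is a weighted average of d_AX - d_BX with weights 1/(σ²_AX + σ²_BX). A minimal Python sketch of that computation follows. The function name `join_branch_lengths` and the flat-list arguments are hypothetical, and the sketch assumes that all variances are positive and that at least one X in S has finite variances.

```python
def join_branch_lengths(d_ab, d_ax, d_bx, var_ax, var_bx):
    """Solve l_A + l_B = d_AB together with the weighted-average
    equation for l_A - l_B.

    d_ax, d_bx, var_ax, var_bx are parallel lists indexed by the
    other subtrees X in S (hypothetical layout for illustration).
    """
    num = 0.0
    den = 0.0
    for dax, dbx, vax, vbx in zip(d_ax, d_bx, var_ax, var_bx):
        w = 1.0 / (vax + vbx)   # an infinite variance gives weight 0.0
        num += w * (dax - dbx)
        den += w
    if den == 0.0:
        raise ValueError("no usable distances to other subtrees")
    diff = num / den            # l_A - l_B
    l_a = (d_ab + diff) / 2.0
    l_b = (d_ab - diff) / 2.0
    return l_a, l_b

# Two other subtrees X, all variances equal to 1.
print(join_branch_lengths(10.0, [7.0, 9.0], [5.0, 7.0], [1.0, 1.0], [1.0, 1.0]))
# (6.0, 4.0)
```

This sketch does not guard against negative branch lengths, which can occur when the weighted difference exceeds d_AB in magnitude. How LST treats that case is not stated in the text above.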
(i) The source code of the library and executables of the kernel (for Linux, Mac OS X,
Solaris, and Irix) are available for download at http://www.cbrg.ethz.ch/.
 