Phylogenetic Tree Reconstruction: Geometric Approaches - Mathematical Concepts and Methods in Modern Biology

Biology Reference

In-Depth Information

v ∈

V for which the outdegree of v is 0 (also sometimes called terminal vertices). A

tree may be rooted or unrooted ; by definition, if T is rooted, then there is a vertex

v o

which has indeg

0, and otherwise T is unrooted. An interior vertex is onewhich

is not a node, although if T is rooted and

(v o ) =

| >

2, the root is usually not regarded as an

interior vertex or as a leaf. A pair of two leaves

is a cherry ,if i and j have a com-

mon parent node, 2 say, a , i.e., there are edges starting at a and ending at i , respectfully,

j . More generally, any leaves sharing a common parent node are called sisters .

Alternatively, when working with unrooted trees, as we will most often do below

for reconstruction methods, one can dispense with directionality (orientation) and

start by defining trees simply as connected graphs (on nonempty vertex sets) with no

cycles or loops. In this case, edges are defined by pairs

{

}

{

}

of (distinct) elements

in V . It is common to represent an edge e

}∈

E simply as a line segment

−

b (or b

−

a ). One simply defines the degree deg

(v)

v ∈

V to be the number

of edges incident to

. Leaves are vertices

with deg

(v) =

1, and

v ∈

V is an

interior vertex if deg

for all interior vertices. Cherries are then pairs of leaves joined by two edges that

meet in an interior vertex

(v)

2. Binary trees are then trees for which deg

(v) =

is still the parent ,or ancestor ).

Sisters are then defined analogously. An unrooted tree can be rooted by picking a

leaf and adding an orientation directing every edge away from it; more commonly

for biological purposes, a node is added along some interior edge, the two new edges

split from it, as well as the remaining edges are, again, directed so that the node has

indegree zero. If T is a directed tree in the earlier sense, then upon “forgetting” the

directions of the edges, one gets an (undirected) tree in this new sense, and for any

vertex v of T

, with deg

(v) =

3 (and

. For unrooted trees, it is convenient

to dispense with orientation, but helpful to view unrooted trees as graphs that could

be directed and rooted as needed. We will assume unrooted trees are just graphs,

rather than directed graphs, and rooted trees are directed graphs. We may use the term

“directed tree” where there might be need or confusion.

There aremanyways of representing trees, andmany types of software available for

drawing and viewing trees. In the following exercise, we explore theNewick format for

representing rooted phylogenetic trees, along with some of the tree-drawing programs

using Newick-style input to output rooted and unrooted trees. We do so through free

access at the website, Phylogeny.fr at http://www.phylogeny.fr . This web-

site is specifically geared toward providing phylogenetic tools for nonspecialists. First,

we note that a compact representation for a rooted tree is to group sisters by parenthe-

ses, e.g., (A,(B,C)); represents the tree with cherry

deg

(v) =

indeg

(v) +

outdeg

(v)

and then a root joining

this cherry to A , as in Example 10.1 . Such a format is the Newick format or Newick

representation of the tree. (Note that the end semicolon in (A,(B,C)); is part of

the standard Newick format, and not meant to be punctuation in regular English.)

Example 10.1. The rooted tree given by (A,(B,C)); see Figure 10.1 .

{

}

2 We use this terminology although in a biological interpretation, interior nodes may not be viewed as

ancestors, but only as demarcation points for speciation.

Mathematical Concepts and Methods in Modern Biology

Search WWH ::

Custom Search

Home