IMFM
Institute of Mathematics, Physics and Mechanics Jadranska 19, 1000 Ljubljana, Slovenia
Preprint series Vol. 49 (2011), 1160
ISSN 2232-2094
COMPUTING QUADRATIC ENTROPY IN EVOLUTIONARY TREES
Drago Bokal Matt DeVos Sandi Klavzar Aki Mimoto Arne 0. Mooers
Ljubljana, September 16, 2011
£ CO CO
u
a
CD u
Computing quadratic entropy in evolutionary trees
a
Drago Bokal*
Faculty of Natural Sicences and Mathematics, University of Maribor,
vO
KoroSka cesta 160, SI-2000 Maribor, Slovenia
Matt DeVos
£
Department of Mathematics, Simon Fraser University, 8888 University Drive, Burnaby BC V5A 1S6, Canada
CM
I
Sandi Klavžar^
CO
Faculty of Mathematics and Physics, University of Ljubljana, Jadranska 19, 1000 Ljubljana, Slovenia and
Faculty of Natural Sciences and Mathematics, University of Maribor, Koroska 160, 2000 Maribor, Slovenia
Aki Mimoto} Arne 0. Mooers§
CD
Department of Biological Sciences, Simon Fraser University,
m
8888 University Drive, Burnaby BC V5A 1S6, Canada .
IATEX-ed: September 16, 2011
We note here that quadratic entropy, a measure of biological diversity introduced by Rao, is a variant of the weighted Wiener index, a graph invariant intensively studied in mathematical chemistry. This fact allows us to deduce some efficient algorithms for computing the quadratic entropy in the case of given tip weights, which may be useful for community biodiversity measures. Furthermore, on ultrametric phylogenetic trees, the maximum of quadratic entropy is a measure of pairwise evolutionary distinctness in conservation biology, introduced by Pavoine. We present an algorithm that maximizes
a
CD
CO
this quantity in linear time, offering a significant improvement over the currently used
vO
quadratic programming approaches.
i—l
Keywords: evolutionary tree, phylogenetic tree, quadratic entropy, originality, distinct-
rh
ness, Wiener index
1 Introduction
CM
Phylogenetic trees are simply graphs depicting the inferred relationships among predefined
CM
sets of leaves (which often correspond to species). This means that they are amenable to
fc analyses with graph theory [35]. If they are given a direction by ufe^g a r°°, w «n
speak about evolutionary trees. Their structure models evolution, which has a direction from
l-H
past to present, and which is generally (but not exclusively, see [2]) diversifying such that the simultaneous production of more than two descendent lineages from an ancestral lineage is rare. Biologists often consider internal vertices to represent extinct ancestral lineages and the edge lengths to represent the amount of evolution that occurred between the species
CD
U
CD CO
* Corresponding author: drago.bokal@uni-mb.si. Supported in part by the Ministry of Higher Education, science and Technology of Slovenia, Research Project L1-5014 and Research Program P1-0297.
t Supported in part by the Ministry of Science of Slovenia under the grant P1-0297. The first and the third author are also with the Institute of Mathematics, Physics and Mechanics, Jadranska 19, 1000 Ljubljana. *The last two authors are also with the Interdisciplinary Research in Computational and Mathematical . rH Sciences Center, Simon Eraser University
^Supported by a fellowship from the Wissenschaftskolleg zu Berlin, Wallotstrasse 19, 14193 Berlin, Gera many, and NSERC Canada
CD CU
a
CD CO
o
vO
evolutionary process model to discrete data measured on the leaves in a maximum likelihood
vO
or Bayesian framework [11]. Because evolutionary trees are representations of the evolutionary history of a set of contemporaneous leaves, they are often forced to be ultrametric, i.e.
¿Q
all leaves are equidistant from the root. Such an evolutionary tree has a height h, the sum of edge lengths on the path from the root to (any) leaf: edge lengths are then inferred to represent the relative elapsed time between internal vertices.
Past mathematical research has considered tree inference, tree shape distribution and accompanying generating models, as well as parameter estimation from inferred trees. So, Semple and Steel [35] summarize how graph theory can contribute to the NP-complete problem of evolutionary tree inference: e.g. what the mathematical properties are for characters that allow for the recovery of the underlying tree under different assumptions about their evolution, and how subtrees can be combined to best preserve their information. Explorations of evolutionary tree structure distributions have a long pedigree, with most attention
CM CM
CO also been related discussion on appropriate prior distributions (of tree topology and edge CM
lengths) for evolutionary tree inference [9, 40, 34], and efficient algorithms listing all possible evolutionary trees for a given set of species have been developed [33]. At the other end of
CO
the evolutionary tree inference cycle, mathematically-inclined biologists have produced tools for estimating evolutionary parameters (speciation and extinction rates) from inferred trees [16, 24, 27].
Graph theory can also bear on practical biological conservation. If we assume that one aspect of the leaves (species) humans would like to conserve is the unique information they
V
embody, and if we let the edge-weighted evolutionary trees represent the pattern of shared
CD
and unique information, we can start to devise approaches that maximize this quantity
focussed on the Yule [41] and uniform [32] distributions of topologies [23, 36]. There has
U
a
CD U
(called 'evolutionary history' or 'phylogenetic diversity') under constraints of final subset size, budgets for conservation, costs of conservation, and the probabilities of species survival
contribution of a leaf to future subsets on an evolutionary tree - these have been termed a
vO
a
CD CO
o
vo
£ CO
CO CD
$H
CD CO
leaf's 'originality' or 'distinctness' [12, 15, 17, 20, 25, 29, 31].
In this contribution, we explore one such measure, quadratic entropy, introduced by Rao
¿Q
[28] and recently applied to evolutionary trees by Pavoine [25, 26]. These authors propose computing the probability distribution p maximizing the quadratic entropy for general finite metric spaces. In the evolutionary context, Pavoine [25] specifically suggests using p as an importance score for species on a tree as a weight representing its expected pairwise contribution of evolutionary originality. The method can be interpreted as finding an optimum of a quadratic mathematical program, yielding an algorithm of complexity O(n4). It has been implemented as a function [4] in the ADE package for analysis of environmental data [1] within the statistical environment R [37]. However, one can use the specific structure of evolutionary trees to develop a linear time algorithm for maximizing the quadratic entropy in two depth-first traversals of the tree. Presenting this algorithm (implemented in R and
CM CM
CO ^^ we make a few additional observations on computing the quadratic entropy and CM
its connections with the graph invariant Wiener index [6, 8, 18, 21], widely known in mathe-
freely available from the first author, [3]) is the main goal of the present contribution. In
matical chemistry. This last connection also allows for an alternative and rapid (linear time)
CO algorithm of computing the quadratic entropy on evolutionary trees with known leaf weights.
2 Evolutionary trees and quadratic entropy
An evolutionary tree T = (T,r,w) consists of a tree T rooted at a vertex r E V(T) whose edges have their length determined by a function w : E(T) ^ R+. Between any two pairs of vertices u,v E V(T), there is a unique shortest path in T, and by the distance d(u,v) between u and v we denote the length of this path, i.e. the sum of the w-values of its edges.
In a rooted tree, every vertex v E V(T) has a unique incident edge ev that lies on the shortest
a
u
CD
a
CD CO
o
vD
path connecting v with r. The component of T — ev containing v is the subtree Tv rooted at
CM
v. Then, Tv = (Tv, v,w/E(T)) is the corresponding evolutionary subtree. The endvertex of
vD
ev distinct from v, is the parent of v, and all neighbors for which v is a parent are children of v.
Each vertex in an evolutionary tree represents a species in the history of Earth. A leaf vertex represents either a living species or an extinct species, and an internal vertex represents the common ancestral species of those corresponding to the vertices in V(Tv). The length of an edge uv represents the time elapsed between the species v, whose immediate ancestor is u, branched into two or (rarely) more new species. Therefore the living species are represented by the leaves at the largest distance h from r, called the height of T. In our study we assume that T contains no extinct leaf species, i.e. all the leaves of T are at distance h from r,
o
making T strictly ultrametric. The height corresponds to the age of the species represented by r. Note that, in the case that the root r has degree one, we do not consider it as a leaf of O T.
CM
CO ^ representing the distance among two ,_randomly selected leaves of T (with repetition). CM
Quadratic entropy E(D) is the expected value of this random variable [25, 26, 28]. We can define it for any metric space X, in which case we are evaluating the expected distance be-
CO
tween two randomly selected elements of X. Thus, if m is the vector of relative frequencies of elements from X (in our case, species) and A is the corresponding distance matrix (in our pL| case, the matrix of distances in the evolutionary tree), then
E(D) = mT A^.
Jh
Matrix multiplication from this formula yields a quadratic algorithm for computing E(D) for given m and A. Further, finding m that maximizes E(D) for given A corresponds to finding a maximum of a quadratic program in variable m and can be done using standard methods
&
of convex programming. In the special case of evolutionary trees, we develop a significantly
$H
Let m be a probability distribution on the leaves of T and let D denote the random vari-
O more efficient algorithms for both tasks, which run in linear time.
CM vO
3 Computing Quadratic Entropy
$H
CD
Suppose T = (Ti,ri,wi) and T2 = (T2,r2,w2) are two evolutionary trees. The join of these g two trees ,s the tree T = 71 + 72, T = (r,r, w), where 7 is obtained from T and T by
identifying their roots r1 and r2 into a new root r, and the length function w of T is induced
CD CO
by the functions w1 and w2.
For a subtree T' of T, let E(D\T') denote the expected value of D conditional to both
vO
leaves being selected from T' and let »(T') denote the sum of »(l) for all leaves l in T', i.e. the probability that a random leaf of T is a leaf of T'. The following proposition holds:
o
Proposition 1 Let T be a tree with probability distribution » on its leaves, and let T be the ^ join of two trees T1 and T2 of the same height h. Then,
o
CM
E(D) = E(D|Ti)MTi)2 + E(D|T2)MT2)2 + 4h»(Ti)»(T2 ).
00
CM Proof.
E(D) = £ »(l)»(l')d(l,l')
i,i'eT
= ^ »(l)»(l')d(l,l') + ^ »(l)»(l')d(l,l') +
i,i'eTi	i,i'eT2
^ »(l)»(l')d(l,l') + ^ »(l)»(l')d(l,l')
ieTi,i'eT2	ieT2,i'eTi
E(D\Ti)»(Ti)2 + E(D\T2 MT2)2 + 4h £ »(l) £ »(l')
CO CD
Jh
leTi I'eTz
= E(D|T)MT)2 + E(D|T2 )^(T2)2 + 4h^(T1)^(r2). We have essentially used the fact that all the leaves are at distance h from r, thus the distance
VD
u
CD
CD
a
CD CO
o
vD
0
o
1
CO ^
CO CO
m
CD
$H
CD
m
Recursive application of Proposition 1 yields a linear algorithm that computes E(D) for a given probability distribution p on the leaves of T. It is presented as Algorithm 1. It computes the values of E(D|TV) in one depth-first traversing of T.
Algorithm 1 Computing quadratic entropy for a given probability distribution. Procedure compute_e(v,d)
Parameter v: vertex for which we are computing E(D|T>). Parameter d: distance from v to the leaves of Tv.
14
15
if v is a leaf then set e(v) = 0.
let p(v) be the assigned probability P(l = v). else
let c\,... ,ct be the children of v.
compute_e(ci,d — d(vci)).
set e(v) = e(c1).
set p(v) = p(c1)
for i = 2 to t do
compute_e(Q,d — d(vci)). set p(v) = p(v) + p(c^. set p = p(ci)/p(v).
set e(v) = e(ci)p2 + e(v)(1 — p)2 + 4dp(1 — p). end for end if
For the proof of correctness in this and the following sections, we introduce some notation. Let v E V(T) be some vertex of T, and let c1,... ,ct be its children. In this context, let Ti be the tree, rooted at v, obtained recursively as T1 := Tc1 U vc1 and Ti = Ti-1 + (Tc. U vci) for i > 2, where Tci U vci is the tree obtained from Tci by adding the edge civ. Note that E(D|TC U vci) and p(TCi U vci) are the same as E(D|TC) and p(TCi), respectively.
Theorem 2 For v E V(T), the value e(v) computed by Algorithm 1 equals E(D|T>). In particular, e(r) is the quadratic entropy of T for a given probability distribution p.
U
a
CD U
Proof. In addition to the statement of the theorem, we claim that p(v) = P(l E Tv). We prove these claims by induction on the number of vertices in Tv. If there are only two vertices
r = u and v, which is the leaf, then ^(v) = 1 (line 3), e(v) = 0 (line 2), ^r(u) = 1 (line 8),
^ and e(u) = 0 (lines 6 and 7), which is correct.
Let there be at least three vertices in Tv. Correct results are computed for the children ci of Vi in lines 6 and 9 by induction. If there is only one child, then ^(q) = ^(T>) by induction, and lines 8 and 7 establish correctness of ^(v) and e(v).
If there are more children, then line 11 computes the probability ^(Ti) and line 12 computes P(l E Tci\l E Ti) by conditional probabilities. Line 13 computes E(D\7i) by Proposition 1. Then e(v) = E(D\T) and ^(v) = ^(Tv) after the execution of the for loop, since T = Tv. The theorem follows.	□
U
CD
a
CD
CO
o
vo
4 Quadratic Entropy Versus Weighted Wiener Index
CO
In this section, we show that there is a close connection between the quadratic entropy and one of the central concepts studied in chemical graph theory. Extending the methods
i
2from this field of research we give an alternative algorithm (Corollary 4) for computing the
CO
quadratic entropy of T. This algorithm avoids computing the distances between the leaves.
CM
In mathematical chemistry, numerous graph invariants are used to analyze and predict physical and chemical properties of chemical compounds. When such invariants are computed
on chemical graphs they are traditionally called topological indices. Among topological indices, the Wiener index is the oldest [39] and one of the most thoroughly studied indices, cf. the surveys [6, 7]. Let G = (V(G),E(G)) be a connected graph. Then the Wiener index W(G) of G is defined as the sum of the shortest path distances between all unordered pairs of vertices:
1
CD
W(G)= £ d(u,v) = 2	d(u,v).
{u,v}Ç V(G)	u,vev (G)
This classical definition was extended in [18] to weighted graphs (G, f), where f : V(G) ^ R
U
a
CD U
O is a vertex weighting function, in the following way:
CM
2	W(G,f ) = 2 £ f (u)f (v)d(u,v).
«,»ev (G)
J-l CD
Note that in the definitions of the (weighted) Wiener index it is assumed that all the edges
s
^ have unit length, that is, the w-values on its edges are all 1.
a
In order to design a linear algorithm for computing the Wiener index of an important
CD CO
class of chemical graphs—benzenoid systems—it was observed in [5] that the Wiener index of a weighted tree can be computed in linear time. (This result in also implicit in [21].) We now
vO
show that the approach can be extended to weighted trees with edges of arbitrary length. The weighted Wiener index W(G, f) is defined as before, except that now d(u, v) is the sum
o
£ of the w-values on a shortest u, v-path.
Let (T, f) be a weighted tree and let uv be an edge of T. Then T — uv consists of connected components, say Tu and Tv, where u E Tu and v E Tv. Let f (Tu) =	f (x)
and f (Tv) = ExeTv f (x).
CM
CO CD
5H
CD CO
Proposition 3 Let (T, f ) be a weighted tree with w-values on its edges. Then
W(T,f )= £ f (Tu)f (Tv)w(u,v).
Proof. Let uv be an arbitrary edge of T and let x, y E V(T). Then e lies on the u, v-path if and only if one of x, y belongs to Tu and the other to Tv. Suppose that this is the case and let x = xi,..., Xj = u, Xj+i = v,..., xk = y be the x, y-path in T. Then
k-1
f (u)f (v)d(u, v) = f (u)f (v) w(x^, xi+1) .
i=1
Hence the contribution of the edge uv to W(G, f) with respect to the unordered pair x,y is
f (x)f (y)w(u,v). Since this holds for all pairs of vertices from Tu and Tv, the result follows.
a
CD □
Corollary 4 Let (T, r, w) be an evolutionary tree with probability distribution ^ on its leaves.
Then
vo	^
E(D) = 2 • £ w(ev)»(TV)(1 - )).
ev €E(T)
CD
^ Proof. Define f : V(T) ^ R with
§	r
I a(u); u is a leaf of T,
&	f (u) = r
I 0; otherwise.
o
^ Then note that E(D) = 2W(T, f) and apply Proposition 3.	□
Special cases of vertex-weighted Wiener indices (and Wiener) polynomial were recently treated in [19, 8], where the assigned weights are vertex degrees. The general case, in which both vertices and edges are weighted, has been to the best of our knowledge treated earlier only by Zmazek and Zerovnik [42]. They give a linear algorithm for cactus graphs, the graphs CM whose blocks are cycles and edges. Hence their (rather involved) algorithm can be considered as an extension of the algorithm that flows from Corollary 4.
^ 5 Maximizing Quadratic Entropy CO
In this section, we present an algorithm that computes the maximum value of quadratic entropy over all possible probability distributions on the leaves of T, together with the probability distribution ^ on the leaves of T that achieves the maximium value. In terms of Wiener index, this problem finds the weighting function on the set of leaves of T, such that the resulting weighted Wiener index is maximum. This weighting is Pavoine's originality
• i
score [25] from conservation biology. To our knowledge, this problem has not been studied
CO
earlier in the Wiener index framework.
Proposition 5 Let T and T2 be two trees of height h with probability distributions
a
and expected distances E(Di), E(D2). Further, let T = T + T2 be their join. Then the
O distribution p, defined with
CM
( 2h-E(D2 ) „ (]) . ](-T y) = ) 4h-E(Di)-E(D2) pi(]) . ]
[ 4h-Ii(Di)—IE(—2) P2(]) ; ] G 72
maximizes E(D) whenever pi and p2 maximize E(Di) and E(D2).
CD
Proof. First note that p(T) = 1 and that p(7i) is obtained by scaling p^, i = 1, 2, therefore E(D|7i) = E(Di). By p(7i) + p(72) = 1 and Proposition 1, we have
E(D) = E(D|Ti)p(7i)2 + E(D|72)(1 - p(7i))2 + 4hp(Ti)(1 - p(7i)).
This expression involves the constant h and three variables, p(T1), E(D|T1) and E(D|T2),
H
which are not independent. We optimize E(D) under the assumption of their independence, C^ which we justify later.
For fixed E(D|T1), E(D|T2), and variable p(T1), E(D) is maximized in the apex of the
i
CM CO
parabola, thus
2h - E(D|72)
p(Ti) 4h - E(D|Ti) - E(D|72)
£ CO
p(72)
implying
2h - E(D|Ti)
4h - E(D|Ti) - E(D|72)' Using these values, we obtain the maximum
E(D) 4h2 - E(D|Ti)E(D|T2)
4h - E(D|Ti) - E(D|T2)'
CO
.¡H
The partial derivative in variables E(D|T1) and E(D|T2) is everywhere nonnegative, thus
CD
E(D) will be maximized when both E(D|T) and E(D|T2) will be largest. By assumption, this is achieved if p restricted to T equals p1 and p restricted to T2 equals p2.
• i
vo
u
CD
CD
a
CD CO
o
vD
0
o
CM
1
CM CO CM CM
£ CO CO
The distribution ^ described by formula (5) satisfies all three conditions: ^(71) and ^(72) have the desired value and ^ restricted to 7 is after normalization equal to i = 1, 2. Since the distribution ^ satisfies the optimality conditions for the optimum without dependence of the three variables, it achieves the optimum in the restricted case and therefore maximizes E(D).	□
Algorithm 2 First pass for computing relative subtree probabilities. Procedure maximize_e(v,d)
Parameter v: vertex for which computation is done. Parameter d: distance from v to the leaves of Tv. 1: set ^r (v) = 1. 2: if v is a leaf then 3: set e(v) = 0. 4: else
5: let ci,..., ct be the children of v.
6
7
8 9
10 11 12
13
14
15
16
17
18 19
maximize_e(ci,d — d(vc^). set e(v) = e(c1). for i = 2 to t do
maximize^c^d — d(v^)). set // (c•) =_2d-s(v)_
set e(v) = e(ci)^r(q)2 + e(v)(1 — (q))2 + 4d^r(q)(1 — (q)). end for
set x = 1 — (ct). for i = t — 1 downto 1 do set y =1 — (ci). set (ci) = (ci)x. set x = xy. end for end if
CO CD
Jh
CD CO
Jh
a
CD U
We use Propositions 1 and 5 recursively in Algorithm 4, which computes ^ in two passes of depth-first traversing of T. In the first pass, Algorithm 2, we compute e(v) = E(D|TV) for every vertex v of T and (v) = P(1 G TV|1 G Tu), i.e. the probability for a leaf selected from TU to lie in Tv, where u is the parent of v. In the second pass, Algorithm 3, we compute absolute probabilities ^(v) = ^(T>) for a leaf to be selected in Tv. A child of some vertex tree is considered at most twice in Algorithm 2 and once in Algorithm 3, thus the algorithm is linear in the number of vertices of the tree. The following Theorem establishes its correctness.
Algorithm 3 Second pass for computing absolute subtree probabilities.
CM Procedure compute _m(v,t)
Parameter v: the vertex for which the computation is done. Parameter t: the value m(TU) for u the parent of v. 1: set m(v) = Mr(v)t. 2: for each child c of v do 3: compute_^(c, m(v)). 4: end for
U
CD
CD
a
Algorithm 4 Recursive calls maximizing the quadratic entropy.
1: maximize_e_Mr(r, h). 2: compute_^(r,1).
o
Theorem 6 The probability distribution m computed by Algorithm 4 maximizes the quadratic entropy of the evolutionary tree T. The value e(r) stores the maximum quadratic entropy.
o
Proof. First we prove by induction on the number of vertices in Tv that Algorithm 2 correctly computes e(v) = E(D|Tv) and Mr(v) = P(/ G Tv|1 G Tu), u being the parent of v. If
&
there are only two vertices r = u and v, which is the leaf, then Mr (v) = 1 (line 1), e(v) = 0
CM
(line 3), Mr(u) = 1 (line 1), and e(u) = e(v) = 0 (lines 6 and 7), which is correct.
CM
Let there be at least three vertices in Tv. By induction, each call in lines 6 and 9 computes
CM
correct information for Tc., where c is some child of v. It is easy to see that the same values fc apply to the _t wt^ m the tree ^ U .coted at t, If there ls only one daM, then
CO lines 1 and 7 assure correctness of m(v) and e(v).
I-H
If there are more children, then line 10 computes the correct value Mr (q) relative to the tree T by Proposition 5, and line 11 correctly computes E(D|Ti) by Proposition 1. Thus Mr(ct) and e(v) are computed correctly in the first loop. For other children of v, at each join evaluated in lines 10 and 11, we would by Proposition 5 need to update Mr(cj), 1 < j < i, by
CD
a factor of 1 — Mr(q), where the latter is the value computed in line 10. This is done in line 17: as the value y accumulates the updating factor, only one visit to each child is necessary
for update. The correctness of Algorithm 2 follows.
U
a
CD U
By conditional probabilities,
vO
P(l G Tv) = P(l G TV A l G Tu) = P(l G T„)P(l G Tv\l G T„).
CD
Thus, ^(v) = ^(u)^r(v), which via line 1 of Algorithm 3 establishes correctness of Algorithm
g
3. We correctly set ^(r) = 1 in line 2 of Algorithm 4. Since ^(v) = ^(Tv) for any leaf v of
p p T, we conclude the proof.	□
CD
O
6 Concluding remarks
We present several new insights into quadratic entropy of ultrametric evolutionary trees,
o
drawn from computational chemistry and graph theory. First, we propose a recursive decomposition of evolutionary trees and derive a formula to express quadratic entropy of the whole tree as a function of quadratic entropies of the subtrees in the decomposition (Proposition
3 ...............
00
community biodiversity index that incorporates evolutionary history and community struc-
CM
ture [28]. We use Proposition 1 to design an efficient linear time algorithm (Algorithm 1) that
£ CO CO
Second, we observe that quadratic entropy of an evolutionary tree is a variant of weighted
computes the quadratic entropy of such trees and can handle large communities (cf. [22]).
Wiener index, a general graph invariant widely used in mathematical chemistry. This link may establish an exchange of ideas between the areas, as demonstrated by the statements of Section 4, where we expose some theoretical properties of quadratic entropy — Wiener index. Finally, maximizing the quadratic entropy offers a novel pairwise originality metric
CD
[25] whose properties still remain relatively unexplored (but see [26]). We derive a linear
CD
time algorithm for its maximization on ultrametric evolutionary trees, Algorithm 4, that
supersedes existing algorithms, and may help in exploration of this quantity.
U
a
CD U
o
vO
O
O References
[1]	Analyses des Données Ecologiques: méthodes Exploratoires et Euclidiennes en sciences de l'Environnement, Centre National De La Recherche Scientifique et Universite Claude
CD
Bernard Lyon 1, Lyon, 2007. http://pbil.univ-lyon1.fr/ADE-4.
a
[2]	M. Baroni, C. Semple, M. Steel, A framework for representing reticulate evolution, Ann. Comb. 8 (2004), 391-408.
CO
[3]	D. Bokal, code in preparation.
[4]	S. Champely, S. Pavoine, divcmax: Maximal value of Rao's diversity coefficient also called quadratic entropy. in: D. Chessel, A. B. Dufour, S. Dray, The ade4 Package, Analysis of Environmental Data: Exploratory and Euclidean methods in Environmental sciences, Universite Claude Bernard Lyon 1, 2006. http://microarrays.unife.it/CRAN/doc/packages/ade4.pdf.
CM
[5]	V. Chepoi, S. KlavZar, The Wiener index and the Szeged index of benzenoid systems in linear time, J. Chem. Inf. Comput. Sci. 37 (1997), 752-755.
CM
Z	..........................................
tions, Acta Appl. Math. 66 (2001), 211-249.
CO
l-H	~
[7]	A. A. Dobrynin, I. Gutman, S. KlavZar, P. Zigert, Wiener index of hexagonal systems, Acta Appl. Math. 72 (2002), 247-294.
PU
[8]	T. Doslic, Vertex-weighted Wiener polynomials for composite graphs, Ars Math. Con-temp. 1 (2008) 66-80.
CD
[9]	A. W. F. Edwards, Estimation of branch points of a branching diffusions process (with
CO
discussion), J. R. Stat. Soc. B 32 (1970), 155-174.
£
[10] D. P. Faith, Conservation evaluation and phylogenetic diversity, Biological Conservation
a
61 (1992), 1-10.
[11]	J. Felsenstein, Inferring Phylogenies, Sinauer Associates, Sunderland, Massachusetts,
2004.
[12]	C.-J. Haake, A. Kashiwada, F. E. Su, The Shapley value of phylogenetic trees, IMW Working Paper 363, (2005).
[13]	K. Hartmann, in preparation.
U
CD
CD
a
CD
CO
[14]	K. Hartmann, M. Steel, Maximising phylogenetic diversity in biodiversity conservation: greedy solutions to the Noah's Ark problem, Systematic Biology 55 (2006), 644-651.
o
[15]	K. Hartmann, M. Steel, Phylogenetic diversity: from combinatorics to ecology, in O. Gas-cuel, M. Steel, eds., Reconstructing evolution: New mathematical and computational approaches, Oxford University Press, Oxford, 2007.
[16] J. Hey, Using phylogenetic trees to study speciation and extinction, Evolution 46 (1992), 627-640.
[17] N. J. B. Isaac, S. T. Truvery, B. Colen, C. Waterman, J. E. M. Baillie, Mammals on the
CM
edge: conservation priorities based on threat and phylogeny, PLoS One 2(3): e296.
CM
[18]	S. Klavzar, I. Gutman, Wiener number of vertex-weighted graphs and a chemical application, Discrete Appl. Math. 80 (1997), 73-81.
CO
l-H
[19]	D. J. Klein, T. Doslic, D. Bonchev, Vertex-weightings for distance moments and thorny graphs, Discrete Appl. Math. 155 (2007) 2294-2302.

[20] R. M. May, Taxonomy as Destiny, Nature 347 (1990), 129-130.
[21] B. Mohar, T. Pisanski, How to compute the Wiener index of a graph, J. Math. Chem.
U
CD
m
2 (1988), 267-277.
[22] M. Vellend, W. Cornwell, K. Magnuson-Ford, A. 0. Mooers. Measuring phyloge-
netic biodiversity. In Biological diversity: frontiers in measurement and assessment.
a
(A. Magurran & B. McGill, eds.) Oxford University Press. In prep.
[23] A. 0. Mooers, L. J. Harmon, M. G. B.Blum, D. H. J. Wong, S. B. Heard, Some models of phylogenetic tree shape, in O. Gascuel, M. Steel, eds., Reconstructing evolution: New
vO
u
CD
CD
S CO CO
u
a
CD u
mathematical and computational advances, Oxford University Press, Oxford (2007), 149-170.
[24] S. Nee, R. M. May, P. H. Harvey, The reconstructed evolutionary process, Philosophical Transactions of the Royal Society Series B-Biological Sciences 344 (1994), 305-311.
CD
[25]	S. Pavoine, S. Ollier, A. B. Dufour, Is the originality of a species measurable?, Ecology Letters 8 (2005), 579-586.
vO
[26]	S. Pavoine, S. Ollier, D. Pontier, Measuring diversity from dissimilarities with Rao's quadratic entropy: Are any dissimilarities suitable?, Theoretical Population Biology 67 (2005), 231-239.
[27]	D. L. Rabosky, Likelihood methods for detecting temporal shifts in diversification rates, Evolution 60 (2006), 1152-1164.
[28]	C. R. Rao, Diversity and dissimilarity coefficients: a unified approach, Theoretical Population Biology 21 (1982), 24-43.
[29]	D. W. Redding, Incorporating Genetic Distinctness And Reserve Occupancy In A Conservation Priorisation Approach, Masters Thesis, University Of East Anglia, Norwich, 2003.
[30]	D. W. Redding, A. Mimoto, D. Bokal, K. Hartmann, M. DeVos, A. 0. Mooers, The most "original" species often capture more phylogenetic diversity than expected, J. Theor.
CD
Biology 251 (2008), 606-615.
CD
[31] D. W. Redding, A. 0. Mooers, Incorporating evolutionary measures into conservation prioritization, Conservation Biology 20 (2006), 1670-1678.
[32]	D. E. Rosen, Vicariant patterns and historical explanation in biogeography, Systematic
Zoology 27 (1978), 159-188.
vO
[33]	C. Semple, Reconstructing minimal rooted trees, Discrete Appl. Math. 127 (2003), 489$ 503.
S3
[34]	C. Semple, M. Steel, A supertree method for rooted trees, Discrete Appl. Math. 105 (2000), 147-158.
CO
[35]	C. Semple, M. Steel, Phylogenetics, Oxford University Press, Oxford, 2003.
o
vO
[36]	M. Steel, A. McKenzie, Properties of phylogenetic trees generated by Yule-type specia-tion models, Mathematical Biosciences 170 (2001), 91-112.
0
[37]	The R Project for Statistical Computing, Department of Statistics and Mathematics, WU Wien, 2007. http://www.r-project.org/.
CD
[38]	M. L. Weitzman, The Noah's Ark Problem, Econometrica 66 (1998), 1279-1298.
1
CM
[39]	H. Wiener, Structural determination of paraffin boiling points, J. Amer. Chem. Soc. 69
CM
(1947), 17-20.
£
[40]	Z. Yang, B. Rannala, Branch-length prior influences Bayesian posterior probability of
CO
phylogeny, Systematic Biology 54 (2005), 455-470.
[41]	G. U. Yule, A mathematical theory of evolution based on the conclusions of Dr J. C. Willis, Philosophical Transactions of the Royal Society (London) Series B—
l-H
Biological Sciences 213 (1924), 21-87.
•in
[42]	B. Zmazek, J. Zerovnik, Computing the weighted Wiener and Szeged number on
CD
weighted cactus graphs in linear time, Croat. Chem. Acta 76 (2003), 137-143.