Distributions on bicoloured evolutionary trees : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Mathematics at Massey University

Steel, Michael Anthony

Files

02_whole.pdf (2.93 MB)

01_front.pdf (682.94 KB)

Date

1989

Authors

Steel, Michael Anthony

Publisher

Massey University

Rights

The Author

Abstract

A central and challenging problem in contemporary biology is how to accurately reconstruct evolutionary trees from DNA sequence data. This thesis addresses three themes from this endeavour -- comparison, consistency and confidence intervals -- by analysing distributions arising from phylogenetic trees. Toward the first theme, the distribution of the symmetric difference metric on pairs of binary and phylogenetic trees is studied, and a number of new results obtained. These theorems, as well as a result on another tree metric, answer previous conjectures in this area. Also under the theme of comparison, we analyse distributions on bicoloured trees arising from the principle of parsimony. A streamlined proof is given of an elegant theorem which allows an efficient comparison of how much better a maximum parsimony tree fits given data than a randomly-chosen tree. A dual distribution, where the tree is fixed and the data varies, is also analysed, answering a recent unsolved problem. We then consider the theoretical accuracy of tree-building methods, concentrating on the statistical property of consistency. Under a simple stochastic model on bicoloured trees, conditions for the consistency of frequently-used methods based on parsimony and compatibility are examined. It is shown that even in "best possible" conditions both methods can be inconsistent, though a strong sufficient condition for compatibility is given. The analysis is extended for a molecular clock. Finally, procedures are described for placing confidence intervals around phylogenies, and limitations on the sort of confidence intervals possible are given. Ways to efficiently implement these procedures are then considered -- in particular, approximate methods, applications to sets of taxa of size four, and simplifications under a molecular clock. The rate at which sequence data must grow as a function of the number of taxa for confidence intervals to converge to a single tree is also considered.
The arguments in this thesis are primarily combinatorial and stochastic. In the hope that their implications will also interest biologists, some space has been given to motivating and explaining the biological relevance of the results presented.

Keywords

Evolutionary trees, Phylogenetic trees, Phylogenetics

URI

http://hdl.handle.net/10179/4216

Collections

Theses and Dissertations
