Abstract
Most existing measures of distance between phylogenetic trees are based on the geometry or topology of the trees. Instead, we consider distance measures which are based on the underlying probability distributions on genetic sequence data induced by trees. Monte Carlo schemes are necessary to calculate these distances approximately, and we describe efficient sampling procedures. Key features of the distances are the ability to include substitution model parameters and to handle trees with different taxon sets in a principled way. We demonstrate some of the properties of these new distance measures and compare them to existing distances, in particular by applying multidimensional scaling to data sets previously reported as containing phylogenetic islands. [Metric; probability distribution; multidimensional scaling; information geometry.]
Original language | English |
---|---|
Pages (from-to) | 320-327 |
Number of pages | 8 |
Journal | Systematic Biology |
Volume | 67 |
Issue number | 2 |
Early online date | 4 Oct 2017 |
Publication status | Published - 1 Mar 2018 |
Externally published | Yes |