Metric learning for phylogenetic invariants

Research output: Working paper › Preprint

Abstract

We introduce new methods for phylogenetic tree quartet construction by using machine learning to optimize the power of phylogenetic invariants. Phylogenetic invariants are polynomials in the joint probabilities that vanish under a model of evolution on a phylogenetic tree. We give algorithms for selecting a good set of invariants and for learning a metric on this set of invariants that optimally distinguishes the different models. Our learning algorithms involve linear and semidefinite programming on data simulated over a wide range of parameters. We provide extensive tests of the learned metrics on simulated data from phylogenetic trees with four leaves under the Jukes-Cantor and Kimura 3-parameter models of DNA evolution. Our method greatly improves on other uses of invariants and is competitive with or better than neighbor-joining. In particular, we obtain metrics trained on trees with short internal branches that perform much better than neighbor-joining on this region of parameter space.
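
To make the definition of a phylogenetic invariant concrete, the following is a minimal sketch and not the selection procedure or learned metric described in the abstract. It builds the exact joint leaf distribution of a four-leaf tree under Jukes-Cantor and checks a standard family of invariants: for the true split 12|34, the 16x16 flattening of the joint distribution has rank at most 4, so all of its 5x5 minors (polynomials in the joint probabilities) vanish. The function names, the branch lengths, the uniform root distribution, and the unweighted `rank_defect` score are all assumptions chosen for illustration.

```python
import numpy as np

def jc_matrix(t):
    """Jukes-Cantor transition matrix for branch length t (expected substitutions per site)."""
    e = np.exp(-4.0 * t / 3.0)
    return np.full((4, 4), 0.25 * (1.0 - e)) + np.eye(4) * e

def quartet_distribution(t1, t2, t3, t4, t_int):
    """Joint leaf distribution P[i,j,k,l] on the tree 12|34 (uniform root, assumed)."""
    pi = np.full(4, 0.25)
    M1, M2, M3, M4, M5 = (jc_matrix(t) for t in (t1, t2, t3, t4, t_int))
    # Sum over the two internal node states a and b.
    return np.einsum("a,ai,aj,ab,bk,bl->ijkl", pi, M1, M2, M5, M3, M4)

def flattening(P, split):
    """16x16 matrix: rows index the states of the leaf pair `split`, columns the other pair."""
    rows = list(split)
    cols = [x for x in range(4) if x not in rows]
    return np.transpose(P, rows + cols).reshape(16, 16)

def rank_defect(P, split, r=4):
    """Frobenius distance from the flattening to the nearest matrix of rank <= r."""
    s = np.linalg.svd(flattening(P, split), compute_uv=False)
    return float(np.sqrt(np.sum(s[r:] ** 2)))

# Hypothetical branch lengths; the internal branch (0.1) is the short one.
P = quartet_distribution(0.2, 0.3, 0.25, 0.15, 0.1)
splits = {"12|34": (0, 1), "13|24": (0, 2), "14|23": (0, 3)}
defects = {name: rank_defect(P, s) for name, s in splits.items()}
best = min(defects, key=defects.get)  # topology whose invariants are closest to zero
print(defects, best)
```

With empirical site-pattern frequencies in place of the exact distribution `P`, no score is exactly zero, and how the individual invariants are weighted against each other then matters. That weighting is what the abstract proposes to learn; the sketch simply uses an unweighted Euclidean norm.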
Original language: English
Publication status: Published - 2015

Publication series

Name: arXiv
