A couple of weeks ago I wrote about a new method to incorporate morphology and DNA sequences into species delimitation. Including both data types improved the results, but a couple of tricky spots remained: 1) correctly assigning individuals to putative species and 2) estimating an accurate guide tree. Two recent papers have developed ways to help us break free from the guide tree and move towards more objective, robust species delimitation.
Jones et al. (2014) present their method DISSECT (Division of Individuals into Species using Sequences and Epsilon-Collapsed Trees), which does not require the a priori assignment of individuals to clusters/species but instead explores all possible clusterings and species tree topologies.
The basic idea behind DISSECT is to sample trees in which each tip represents a single individual (or a cluster of individuals which definitely belong in one species), but replace the usual prior density on node heights with one which includes a spike near zero. The dimensionality of the parameter space is fixed, but nodes whose heights have a high posterior probability of being within the spike can be interpreted as ‘probably collapsed’.
The analysis can be run in BEAST (version 1.8.1 and later) and Jones et al. provide a nice section describing the workflow of the analysis and some advice on how to set the parameters and priors.
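To make the "spike near zero" idea a bit more concrete, here is a minimal Python sketch. It is not the DISSECT implementation: the exponential slab is a stand-in I chose for simplicity (the paper builds its prior on a birth-death model), and the function names, `epsilon`, `weight`, and `slab_rate` are all hypothetical. The first function is a mixture density on a node height, and the second estimates how often a node sits inside the spike across a set of posterior samples.

```python
import math


def spike_and_slab_density(height, epsilon=1e-4, weight=0.5, slab_rate=100.0):
    """Mixture density on a node height (illustrative assumption, not DISSECT's prior).

    With probability `weight` the height is uniform on [0, epsilon) (the spike);
    otherwise it follows an exponential distribution (the slab).
    """
    if height < 0:
        return 0.0
    spike = 1.0 / epsilon if height < epsilon else 0.0
    slab = slab_rate * math.exp(-slab_rate * height)
    return weight * spike + (1.0 - weight) * slab


def collapse_probability(sampled_heights, epsilon=1e-4):
    """Fraction of posterior samples in which the node height lies inside the spike."""
    if not sampled_heights:
        raise ValueError("need at least one sampled height")
    inside = sum(1 for h in sampled_heights if h < epsilon)
    return inside / len(sampled_heights)


if __name__ == "__main__":
    # Hypothetical heights for one node across four MCMC samples.
    heights = [1e-5, 2e-5, 0.01, 3e-5]
    print(collapse_probability(heights))
```

A node with a collapse probability close to 1 would be read as "probably collapsed", meaning the tips below it are probably members of the same species.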
In the second paper, Yang and Rannala (2014) present an updated version of their program BPP, which eliminates the user-defined guide tree in species delimitation and incorporates phylogenetic uncertainty of the gene trees in a Bayesian framework.
A novel MCMC proposal based on the nearest-neighbor interchange (NNI) algorithm for rooted trees is developed here to change the species tree topology, eliminating the need for a user-specified guide tree. The gene trees for multiple loci are altered in the proposal to avoid conflicts with the newly proposed species tree.
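For readers who have not met NNI before, here is a rough Python sketch of the topology change alone on a rooted binary tree. It is only an illustration under my own assumptions: trees are nested tuples, `nni_neighbors` is a hypothetical helper, and none of the node-age or gene-tree adjustments described in the paper are attempted. For a subtree ((a, b), c), a rooted NNI swaps c with either a or b.

```python
def nni_neighbors(tree):
    """Return every topology one rooted NNI away from a nested-tuple binary tree.

    Leaves are any non-tuple values; internal nodes are 2-tuples (left, right).
    """
    if not isinstance(tree, tuple):
        return []  # a leaf has no internal branch to rearrange

    left, right = tree
    neighbors = []

    # Rearrange around the branch leading to an internal left child:
    # ((a, b), right) -> ((a, right), b) and ((b, right), a)
    if isinstance(left, tuple):
        a, b = left
        neighbors.append(((a, right), b))
        neighbors.append(((b, right), a))

    # Rearrange around the branch leading to an internal right child:
    # (left, (a, b)) -> (a, (left, b)) and (b, (left, a))
    if isinstance(right, tuple):
        a, b = right
        neighbors.append((a, (left, b)))
        neighbors.append((b, (left, a)))

    # Rearrangements deeper in the tree leave the current node unchanged.
    neighbors.extend((n, right) for n in nni_neighbors(left))
    neighbors.extend((left, n) for n in nni_neighbors(right))
    return neighbors


if __name__ == "__main__":
    for candidate in nni_neighbors((("A", "B"), ("C", "D"))):
        print(candidate)
```

An MCMC proposal would pick one of these neighbors at random and then accept or reject it. The hard part, which this sketch skips entirely, is modifying the gene trees so that they remain compatible with the proposed species tree.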
One potential drawback of the new BPP is that it may not be practical for a large number of populations: computation time increases much more quickly with the number of populations than with the number of sequences per locus. Nevertheless, DISSECT and the new version of BPP are exciting steps forward in species delimitation, and I look forward to seeing other researchers test them on empirical systems.
Jones, G., Aydin, Z., & Oxelman, B. (2014). DISSECT: an assignment-free Bayesian discovery method for species delimitation under the multispecies coalescent. Bioinformatics, btu770. DOI: 10.1093/bioinformatics/btu770
Yang, Z., & Rannala, B. (2014). Unguided species delimitation using DNA sequence data from multiple loci. Molecular Biology and Evolution, 31(12), 3125-3135. DOI: 10.1093/molbev/msu279