Incorporating phenotype and genotype in model-based species delimitation

 


Figure by Jeremy Yoder showing discordance between gene trees and the species tree, a phenomenon that complicates species delimitation from genetic data.


Species are the fundamental unit of biology, but identifying them is a challenging task that receives a lot of theoretical and empirical attention. In a recent Evolution paper, Solís‐Lemus et al. (2015) introduce a new model-based method that integrates phenotypic and genetic data to delimit species boundaries. The method also accommodates divergence with gene flow and selectively driven divergence.

The goal of our work is to develop a species delimitation method to combine genetic and trait data into a common framework based on an explicit model of evolution. Specifically, we extend the Bayesian program BPP (Bayesian phylogenetics and phylogeography, Yang and Rannala 2010) to combine genetic and quantitative trait data in a single Bayesian framework, which we call iBPP (integrated BPP).


As in BPP, in iBPP the user provides a guide tree assigning individuals to putative species and a set of gene trees. One or more nodes in the guide tree are collapsed to generate alternative hypotheses of species delimitation. The posterior probability of a particular hypothesis is calculated from the gene trees based on the multi-species coalescent. iBPP also uses independent quantitative continuous traits to evaluate the posterior probability of a species delimitation hypothesis.
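To make the "collapsing nodes" idea concrete, here is a minimal sketch that lists the delimitation hypotheses implied by a small guide tree. It is not code from BPP or iBPP. The nested-tuple tree format and the function names `leaves` and `delimitations` are made up for illustration, and the sketch assumes that collapsing a node lumps every putative species below it into a single species.

```python
from itertools import product


def leaves(node):
    """Return the putative species (tip names) below a node."""
    if isinstance(node, str):
        return [node]
    return [tip for child in node for tip in leaves(child)]


def delimitations(node):
    """List every delimitation hypothesis for the subtree rooted at node.

    A node is either a tip name (str) or a tuple (left, right).
    Each hypothesis is a list of frozensets, one frozenset per species.
    Assumption: collapsing a node merges all tips below it.
    """
    if isinstance(node, str):
        return [[frozenset([node])]]
    # Option 1: collapse this node, so everything below is one species.
    hypotheses = [[frozenset(leaves(node))]]
    # Option 2: keep this node, and combine the options of its two children.
    left, right = node
    for left_hyp, right_hyp in product(delimitations(left), delimitations(right)):
        hypotheses.append(left_hyp + right_hyp)
    return hypotheses


if __name__ == "__main__":
    guide_tree = (("A", "B"), "C")  # hypothetical three-species guide tree
    for hypothesis in delimitations(guide_tree):
        print(sorted("".join(sorted(species)) for species in hypothesis))
```

For the guide tree ((A,B),C) this gives three hypotheses: one species (ABC), two species (AB and C), and the fully resolved three species. The method then assigns a posterior probability to each hypothesis.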

Each [phenotypic] trait is assumed to have a normal distribution, with species means governed by a Brownian motion (BM) process along the species tree and individuals normally distributed around the species means. A parameter λ models the between-to-within species variance ratio…
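The trait model in that passage can be sketched as a small simulation. This is an illustration under stated assumptions, not the authors' companion simulator. The tree format, the function names, and the default values are hypothetical, and the sketch assumes that the Brownian motion variance per unit branch length equals λ times the within-species variance, which is one plausible reading of "between-to-within species variance ratio".

```python
import random


def simulate_species_means(node, mean, rate, rng, out):
    """Evolve a trait mean down the species tree by Brownian motion.

    A node is a tip name (str) or a list of (child, branch_length) pairs.
    Along a branch of length t the mean changes by a Normal(0, rate * t) draw.
    """
    if isinstance(node, str):
        out[node] = mean
        return out
    for child, branch_length in node:
        child_mean = rng.gauss(mean, (rate * branch_length) ** 0.5)
        simulate_species_means(child, child_mean, rate, rng, out)
    return out


def simulate_trait(tree, lam, within_var=1.0, n_per_species=5, seed=None):
    """Simulate one continuous trait for every individual in every species.

    Assumption: BM rate = lam * within_var, so lam plays the role of the
    between-to-within species variance ratio.
    """
    rng = random.Random(seed)
    means = simulate_species_means(tree, 0.0, lam * within_var, rng, {})
    sd = within_var ** 0.5
    # Individuals are normally distributed around their species mean.
    individuals = {
        species: [rng.gauss(mu, sd) for _ in range(n_per_species)]
        for species, mu in means.items()
    }
    return means, individuals


# Hypothetical three-species tree ((A,B),C) with total height 1.
guide_tree = [([("A", 0.5), ("B", 0.5)], 0.5), ("C", 1.0)]

if __name__ == "__main__":
    means, individuals = simulate_trait(guide_tree, lam=4.0, seed=1)
    for species in sorted(individuals):
        print(species, round(means[species], 2),
              [round(x, 2) for x in individuals[species]])
```

In this sketch, λ = 0 gives every species the same mean, so the trait carries no signal about species boundaries. Larger λ spreads the species means apart relative to the scatter among individuals.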

The authors used empirical data and data simulated under several scenarios to test their new method. Here is a brief review of some of their (extensive) findings:

  • It was possible to delimit species with a model-based analysis of phenotypic data using only one, two, or three traits (see Figure 3 in the paper). In some instances, phenotypic traits provided more information than genetic data
  • Accuracy of species delimitation improved when both data types were used (see curves with solid black dots in Figure 5 in the paper and reproduced here below)
  • Under scenarios including gene flow, the posterior probability of the true species delimitation hypothesis decreased with increasing migration when only using genetic data. Including phenotypic data from traits under selection increased the posterior probability for the true model, even when migration was high (see top right panel of Figure 5 in the paper and reproduced here below)

Mean posterior probability (PP) of the tree with the three true species with increasing migration rates. Each point is the mean over 100 replicates for each scenario. Lines show SEs. Simulations included five individuals per putative species, a total tree height of 1 coalescent units, θ = 0.001, and 600 bp loci. Figure and caption from Solís‐Lemus et al. (2015).


The species delimitation process still has plenty of kinks. For example, assigning individuals to putative species, an upstream requirement for programs like BPP and iBPP, is prone to error and currently has no standard method. Even so, integrating phenotypic and genetic data (and perhaps soon geographic data? See “Future Developments For iBPP (Or Other Model-Based Integrative Approaches)” in Solís‐Lemus et al.) is a step toward a holistic species delimitation framework.
The iBPP program and its companion simulation program are available open-source at https://github.com/cecileane/iBPP/
Solís‐Lemus, C., Knowles, L. L., & Ané, C. (2015). Bayesian species delimitation combining multiple genes and traits in a unified framework. Evolution. DOI: 10.1111/evo.12582
Yang, Z., & Rannala, B. (2010). Bayesian species delimitation using multilocus sequence data. Proceedings of the National Academy of Sciences, 107, 9264-9269. DOI: 10.1073/pnas.0913022107
