Sweeps and Demographic Inference

Population genetics presents us with numerous conundrums – several of which have to do with how the same genomic pattern can be “reached” over evolutionary time through multiple alternative demographic or selective processes. I have discussed several of these issues before (here and here), wherein demography confounds selection or vice versa. Studies that estimate genetic diversity, differentiation, and/or effective population sizes thus need to pay attention to the effects of linked selection and sweeps before jumping to conclusions about the underlying evolutionary history. In a new manuscript, Schrider et al. (2016) discuss the confounding effects of sweeps on the inference of effective population sizes using three popular model-based inference platforms – ABC, δaδi, and PSMC.

Briefly, using coalescent simulations of 500 unlinked loci and 100 replicate genomes under each of four population histories (constant size, bottleneck, exponential growth, and bottleneck followed by exponential growth), they assess how well genetic diversity (π), Tajima’s D, and the three methods above capture the effects of linked selective sweeps of varying intensity on sites at increasing genetic distance. For inference using PSMC, the authors simulated 100 replicates of 15 Mb genomes under four scenarios – neutral, one recent sweep, three recurrent sweeps, and one with five sweeps.
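For readers less familiar with the two summary statistics mentioned above, here is a minimal sketch of how π and Tajima’s D can be computed from a sample of haplotypes. It is not the authors’ pipeline: the function names and the input format (a list of equal-length haplotypes coded as 0 for the ancestral and 1 for the derived allele) are assumptions made purely for illustration. The constants follow the standard definition of Tajima’s D.

```python
from math import sqrt


def nucleotide_diversity(haplotypes):
    """Mean number of pairwise differences (pi), summed over all sites.

    `haplotypes` is a list of equal-length lists of 0/1 alleles
    (hypothetical input format, chosen for this illustration).
    """
    n = len(haplotypes)
    pi = 0.0
    for site in zip(*haplotypes):
        k = sum(site)  # copies of the derived allele at this site
        # Fraction of the n*(n-1)/2 pairs that differ at this site.
        pi += 2.0 * k * (n - k) / (n * (n - 1))
    return pi


def tajimas_d(haplotypes):
    """Tajima's D; returns None when there are no segregating sites."""
    n = len(haplotypes)
    seg = sum(1 for site in zip(*haplotypes) if 0 < sum(site) < n)
    if seg == 0:
        return None

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Difference between the two estimators of theta (pi and S/a1),
    # scaled by the estimated standard deviation of that difference.
    d = nucleotide_diversity(haplotypes) - seg / a1
    return d / sqrt(e1 * seg + e2 * seg * (seg - 1))


# Toy sample of four haplotypes where every variant is a singleton.
singletons = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]
print(nucleotide_diversity(singletons), tajimas_d(singletons))
```

An excess of rare variants, as in the toy sample, pushes Tajima’s D below zero. Both a recent sweep at a linked site and population growth tend to produce that pattern, which is part of why the two are hard to tell apart from summary statistics alone.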

Inference of effective population size change using PSMC under different scenarios of recurrent sweeps. Image courtesy: Figure 5 of Schrider et al. (2016)

Under the bottleneck model, increasing the number of loci under sweeps upwardly biased estimates of effective population size with both δaδi and ABC. Similarly, under the population growth model, both methods were biased towards more recent and faster growth. Inferences under the contraction-followed-by-growth model were also biased, though differently for the two methods. Inference using PSMC indicated that sweeps can considerably influence estimates of population size change, depending on the number of recurrent sweeps over evolutionary time: the variance in estimates increased with the number of sweeps, which can “dramatically skew” the estimates. Note, however, that this is exactly what one would expect to see when using PSMC in the presence of sweeps – selective sweeps cause drastic reductions in effective population size, which can be confounded with true bottlenecks (see this interesting Twitter conversation over this debate).

Rightfully so, Schrider et al. (2016) caution scientists about the challenges in “simultaneous estimation of parameters related to natural selection and demographic history”.

Until an approach to obtain accurate estimates of demographic parameters in the face of natural selection is devised, population size histories inferred from population genetic datasets could remain significantly biased.

Reference:

Schrider, Daniel, Alexander G. Shanku, and Andrew D. Kern. “Effects of linked selective sweeps on demographic inference and model selection.” bioRxiv (2016): 047019. DOI: http://dx.doi.org/10.1101/047019

About Arun Sethuraman

I am a computational biologist, and I build statistical models and tools for population genetics. I am particularly interested in studying the dynamics of structured populations, genetic admixture, and ancestral demography.