Continuing our discussion of the neutralist-selectionist debate, recent findings by Schrider et al. (2015) bring us to the topic of selective sweeps and their genomic signatures in a population. As we have discussed in previous posts, numerous studies since the proposal of the neutral theory (Kimura 1968) have provided evidence for the fixation of beneficial mutations by positive selection, and for their role in adaptive evolution. While several mechanisms have been proposed for how positively selected alleles reach fixation (see my previous post here for some thoughts on the effects of recombination in adaptive evolution), a very plausible one, with evidence growing by the day, is the selective sweep: the rapid rise to fixation of a beneficial allele under positive selection, and the accompanying depletion of linked neutral diversity around that allele through genetic hitchhiking. Sweeps are classified as hard (the beneficial allele starts at frequency 1/2N, i.e. as a single new copy in a diploid population of size N), soft (initial frequency > 1/2N, because the allele was segregating at or near neutrality until some perturbation, often environmental, set off the sweep), or partial (also called incomplete, where the beneficial allele has yet to reach fixation). The detection of sweeps has been used extensively in recent years to describe signatures of selection across the genome.
Signatures of selection can be described using several summary statistics, including polymorphism levels, site-specific diversity, haplotype diversity, Tajima’s D, and LD-based statistics. Schrider et al. (2015) use simulations to examine how well these summary statistics characterize selective sweeps. In short, the statistics rely on (a) the depletion of genomic diversity around a selected site (e.g. see Figure 2 from Maynard Smith and Haigh 1974 above), and (b) haplotype diversity: a recent hard sweep should leave a single “fixed” haplotype at high frequency around the selected site, whereas soft or incomplete sweeps should leave multiple haplotypes at intermediate frequencies. But through recombination between the selected allele and linked neutral alleles, a hard sweep, particularly one that is not so recent, can still produce multiple haplotypes at intermediate frequencies at sites linked to the selected site. Methods to detect sweeps would then wrongly classify these regions as soft or partial sweeps, a phenomenon the authors term the “soft shoulder” effect.
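To make a few of these statistics concrete, here is a minimal Python sketch that computes the number of segregating sites, nucleotide diversity (mean pairwise differences), Tajima's D, and haplotype diversity from a small sample. The input format (a list of equal-length '0'/'1' strings, one per sampled chromosome) and the function name `summary_stats` are my own assumptions for illustration; this is not the pipeline used by Schrider et al. (2015).

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def summary_stats(haplotypes):
    """Summary statistics for a sample of haplotypes.

    haplotypes: list of equal-length strings of '0' (ancestral) and
    '1' (derived), one string per sampled chromosome (assumed format).
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])

    # S: number of segregating (polymorphic) sites in the sample.
    S = sum(1 for j in range(n_sites) if len({h[j] for h in haplotypes}) > 1)

    # pi: mean number of differences over all pairs of chromosomes.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Standard constants for Tajima's D, which depend only on sample size.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    if S > 0:
        tajimas_d = (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
    else:
        tajimas_d = float("nan")  # undefined without segregating sites

    # Haplotype diversity: chance that two sampled chromosomes differ.
    counts = Counter(haplotypes)
    hom = sum((c / n) ** 2 for c in counts.values())
    hap_div = (n / (n - 1)) * (1 - hom)

    return {
        "S": S,
        "pi": pi,
        "tajimas_d": tajimas_d,
        "n_haplotypes": len(counts),
        "haplotype_diversity": hap_div,
    }


# Toy samples (made up for illustration, not real data).
hard_like = ["00000"] * 5 + ["00001"]  # one dominant haplotype, one rare variant
soft_like = ["0000", "0000", "1100", "1100", "0011", "0011"]  # several common haplotypes
print(summary_stats(hard_like))
print(summary_stats(soft_like))
```

In the first toy sample, a single dominant haplotype with one rare variant gives low haplotype diversity and a negative Tajima's D, as expected near a recent hard sweep. In the second, several haplotypes at intermediate frequency give a positive D. Real scans compute such statistics in windows along the genome, which is where the shoulder effect comes into play.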
To describe this effect, the authors perform coalescent simulations under different sweep scenarios, varying (a) the initial frequency of the sweeping allele, (b) the time of the sweep, and (c) the selection coefficient. Analyses of several summary statistics consistently support the “soft shoulder effect”, with numerous false positives for soft/partial sweeps at sites linked to hard sweeping alleles.
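To get a feel for what two of these parameters (initial frequency and selection coefficient) do, here is a toy forward-in-time Wright-Fisher sketch of the frequency trajectory of a beneficial allele. To be clear, this is not the coalescent framework the authors use, and it tracks only the selected allele, not the linked neutral variation. The function names, the genic selection model, and the parameter values are all assumptions chosen for illustration.

```python
import random


def one_trajectory(N, s, x0, rng, max_gen=10_000):
    """One Wright-Fisher trajectory for an allele with selective advantage s.

    N: diploid population size (2N gene copies); x0: starting frequency.
    Returns the list of frequencies until loss, fixation, or max_gen.
    """
    copies = 2 * N
    x = x0
    traj = [x]
    for _ in range(max_gen):
        if x == 0.0 or x == 1.0:
            break
        # Selection: expected frequency after selection (genic model).
        p = x * (1 + s) / (1 + s * x)
        # Drift: binomial sampling of 2N gene copies for the next generation.
        k = sum(rng.random() < p for _ in range(copies))
        x = k / copies
        traj.append(x)
    return traj


def simulate_sweep(N, s, x0, seed=1, max_tries=10_000):
    """Repeat until the allele fixes, i.e. condition on a completed sweep."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        traj = one_trajectory(N, s, x0, rng)
        if traj[-1] == 1.0:
            return traj
    raise RuntimeError("no fixation observed; try a larger s or x0")


N = 500
# Hard sweep: the allele starts as a single new copy, frequency 1/2N.
hard = simulate_sweep(N, s=0.1, x0=1 / (2 * N))
# Soft sweep from standing variation: the allele starts at 5%.
soft = simulate_sweep(N, s=0.1, x0=0.05)
print("hard sweep fixed after", len(hard) - 1, "generations")
print("soft sweep fixed after", len(soft) - 1, "generations")
```

Most new single-copy mutations are lost to drift even when beneficial, which is why the sketch retries until fixation. A sweep starting from standing variation begins on many chromosomes at once, and that is what leaves several haplotypes at intermediate frequency around the selected site.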
The authors thus recommend interpreting studies that perform genome-wide scans for positively selected sites (and sweeps) with care, and offer several suggestions:
- Analyze flanking regions to detect selection (and sweeps), rather than only the region immediately surrounding the selected site.
- Apply methods that account for polymorphism, allele frequency, haplotype diversity, and LD-based statistics.
- Account for gene conversion rates.
- Importantly, whenever a soft/partial sweep is found, check for evidence of a nearby hard sweep to rule out the “shoulder effect”.
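The last suggestion is easy to sketch as a post-processing step on the output of a sweep scan. The snippet below assumes a hypothetical list of `(position, label)` calls and a user-chosen distance threshold `max_dist`. Both are my own illustrative choices, not something specified by the authors, and a sensible threshold would depend on the local recombination rate and the strength of selection.

```python
def flag_possible_shoulders(calls, max_dist):
    """Flag soft/partial sweep calls that lie close to a hard sweep call.

    calls: list of (position_bp, label) tuples, with label one of
           "hard", "soft", or "partial" (hypothetical scan output).
    max_dist: distance in bp within which a hard sweep counts as "nearby"
              (an assumed, user-chosen threshold).
    Returns a list of (position_bp, label, near_hard_sweep) tuples.
    """
    hard_positions = [pos for pos, label in calls if label == "hard"]
    flagged = []
    for pos, label in calls:
        if label in ("soft", "partial"):
            # True if any hard sweep call falls within max_dist of this call.
            near = any(abs(pos - h) <= max_dist for h in hard_positions)
            flagged.append((pos, label, near))
    return flagged


# Made-up calls along one chromosome, for illustration only.
calls = [(100_000, "soft"), (150_000, "hard"), (900_000, "partial")]
for pos, label, near in flag_possible_shoulders(calls, max_dist=100_000):
    note = "possible soft shoulder, inspect further" if near else "no hard sweep nearby"
    print(pos, label, note)
```

A flagged call is not necessarily a false positive. It only marks a region where the soft or partial classification deserves a closer look before it is interpreted.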
Schrider, D. R., et al., 2015 Soft shoulders ahead: spurious signatures of soft and partial selective sweeps result from linked hard sweeps. Genetics. http://dx.doi.org/10.1534/genetics.115.174912
Maynard Smith, J., and J. Haigh, 1974 The hitch-hiking effect of a favourable gene. Genet. Res. 23: 23-35.
Kimura, M., 1968 Evolutionary rate at the molecular level. Nature 217: 624-626.