Speciation with gene flow and the virtual beanbag: Genome-level effects increase divergence during ecological speciation, but linkage is not required

This post is a guest contribution by Dylan Goldade, Kathryn Theiss, and Chris Smith, from the Biology Department at Willamette University. See below for the coauthors’ affiliations and research interests.

In a famous address given on the hundredth anniversary of the publication of The Origin of Species, Ernst Mayr (Figure 1) critiqued the work of early 20th-Century population geneticists, including Fisher and Haldane, as ‘beanbag genetics’ (Provine 2004). Mathematical models of natural selection that consider only a single locus in isolation, Mayr argued, overlook the importance of epistatic interactions between loci. Very strong interactions between loci in the form of ‘coadapted gene complexes’, Mayr later argued, might play a major role in peripatric speciation; ‘genetic revolutions’ following population bottlenecks might allow small populations to move between two genetic equilibria.

Figure 1: Ernst Mayr in 1994, after receiving an honorary degree at the University of Konstanz. Image from Meyer A. (2005).

Mayr foresaw that genome-level effects could be important in understanding the formation of new species. Although Mayr was almost monomaniacally focused on allopatric speciation, new simulation results produced by Sam Flaxman, Jeff Feder, and Patrik Nosil (Flaxman et al. 2013) suggest that ecological speciation is also more likely to occur in models that incorporate genome-level effects. However, contrary to the predictions made by a number of authors, linkage seems to play little role in facilitating speciation with gene flow.

Ecological speciation is the evolution of barriers to gene flow due to divergent natural selection mediated by the environment. Classic examples of ecological speciation include benthic and limnetic morphs of three-spine sticklebacks (Figure 2), apple maggot flies, and Timema walking sticks. In contrast to traditional models of speciation, such as allopatric and sympatric speciation, ecological speciation is agnostic with respect to the geography of speciation. However, in cases where there is not strict allopatry, ecological speciation describes a process by which divergent selection overcomes recurrent gene flow (Rundle & Nosil 2005).

Figure 2: Three-spine stickleback. Photo via wolfpix.

Models of speciation that involve ongoing gene flow remain controversial because gene flow is expected to homogenize differences between populations. However, genome-level effects may facilitate speciation with gene flow. For example, selection against immigrants may have the effect of reducing realized gene flow, even at loci that are not under divergent selection (Rundle & Nosil 2005). This global reduction in gene flow and increased divergence across the genome due to divergent selection is termed ‘Genome Hitchhiking’ (Feder et al. 2012). Genome hitchhiking may be enhanced by fitness epistasis – multiple loci interacting synergistically to cause reductions in fitness that are greater than selection acting on any one locus.

Additionally, a number of authors have suggested that physical linkage to a gene experiencing strong divergent natural selection may increase the probability that neutral and weakly selected alleles at closely linked loci also fix. Strong divergent selection at a given locus reduces effective gene flow at neighboring loci, thereby increasing the probability that weakly selected, locally adapted alleles at those loci will also increase in frequency. Over time, these ‘islands’ of genomic divergence are predicted to slowly spread outward, as additional loosely linked loci are also fixed, launching a ‘cascade of divergence’ that spreads through the genome (Via 2012). This phenomenon is termed ‘Divergence Hitchhiking’.

Flaxman et al. used a simulation approach to study evolution during speciation with gene flow and to examine the impact of genome-level effects on the progress of speciation. They considered the proportion of divergently selected mutants that establish, the reduction in effective gene flow, and changes in FST over time at each locus under three models: a direct selection only (DS) model, a model that includes genome hitchhiking (GH) but not linkage, and a divergence hitchhiking (DH) model that includes linkage (Figure 3).

Figure 3: A representation of the three models (modified from Flaxman et al. 2013, Figure 1). DS = Direct Selection, a beanbag genetics model in which selection acts independently at every locus. GH = Genome Hitchhiking, in which selection acts on individuals whose fitness is determined by their multilocus genotype. There is no linkage between loci in this model. DH = Divergence Hitchhiking, in which loci are arranged into chromosomes, and alleles at different loci are more likely to be inherited together the closer they are physically. m = migration rate per individual per generation. Note that we have represented different loci using letters, and used subscripts to represent different alleles at each locus. All loci are codominant. Derived alleles/mutants are shown in red. Lines represent chromosomal (physical) linkage between loci.

The direct selection (DS) model was a true ‘beanbag genetics’ model in the sense that Mayr intended the original epithet: evolution at each locus was independent of every other locus. In both of the other two models, genome-level effects occurred. In the genome hitchhiking (GH) model, genes were packaged into individuals, so the fate of any particular gene copy in each generation was influenced by the cumulative fitness effects of all the other genes in that individual. The divergence hitchhiking (DH) model packaged loci into individuals, organized onto chromosomes, and thus allowed for linkage so that alleles physically linked to one another were more likely to be inherited together and to remain associated across generations.
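
The difference between the GH and DH models comes down to how gametes are built from a parent's two haplotypes. The following is a minimal Python sketch of that idea, not the authors' simulation code: the function name `make_gamete` and the per-interval crossover probabilities `rec_probs` are our own illustrative assumptions. Setting every crossover probability to 0.5 mimics unlinked loci (as in GH), while smaller values mimic physical linkage (as in DH).

```python
import random

def make_gamete(hap_a, hap_b, rec_probs, rng):
    """Build one gamete from two parental haplotypes.

    hap_a, hap_b: lists of alleles in the same locus order.
    rec_probs[i]: assumed probability of a crossover between locus i and i+1
                  (0.5 = unlinked, values near 0 = tightly linked).
    rng: a random.Random instance.
    """
    # Start copying from a randomly chosen parental haplotype.
    current, other = (hap_a, hap_b) if rng.random() < 0.5 else (hap_b, hap_a)
    gamete = [current[0]]
    for i in range(1, len(hap_a)):
        # A crossover switches which haplotype is being copied.
        if rng.random() < rec_probs[i - 1]:
            current, other = other, current
        gamete.append(current[i])
    return gamete

rng = random.Random(42)
ancestral = [0, 0, 0, 0]
derived = [1, 1, 1, 1]
unlinked = make_gamete(ancestral, derived, [0.5, 0.5, 0.5], rng)     # GH-like
linked = make_gamete(ancestral, derived, [0.01, 0.01, 0.01], rng)    # DH-like
```

With tight linkage, most gametes carry one parental haplotype intact, which is what allows a weakly selected allele to stay associated with a strongly selected neighbor across generations.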

The simulations considered two demes that exchanged migrants at some rate (m) per generation. The simulations began with the demes being genetically identical, and every individual being homozygous at every locus. (The authors also considered scenarios in which the two demes begin the simulation having already diverged at three different loci, each of which has a large effect on fitness). Over time mutations arose at different loci under an infinite sites model (that is, mutations could occur only once at each locus). Mutations could arise in either deme, but mutant alleles were favored by natural selection in deme 2, and disfavored in deme 1. The strength of selection acting at a given locus was allowed to vary; for some loci the strength of selection was very low, and these loci were therefore effectively neutral.

In the GH and DH models, the offspring of the previous generation ‘hatched’ in either deme 1 or deme 2, and could then migrate to the other deme. The probability that an individual migrated was m, the per-individual migration rate. Following migration, each individual contributed gametes to the next generation with a probability proportional to its relative fitness (Figure 4). Fitness was determined by the product of the fitness effects of the individual’s genotype at every locus (thus mutations at different loci affected fitness synergistically, and the model therefore assumed a kind of fitness epistasis). In the DS scenario, alleles were not organized into individuals, so the probability that an allele was passed to the next generation was determined solely by that allele’s weighted fitness. For purposes of migration in the DS scenario, alleles moved between demes as collections of alleles at different loci sampled at random from the population, but these ‘individuals’ effectively ‘dissolved’ back into the population beanbag upon reaching the destination deme and before reproducing.
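
To make the fitness and selection step concrete, here is a rough Python sketch of how an individual's fitness and its chance of reproducing might be computed in a GH- or DH-style model. It is an illustration under stated assumptions, not the scheme from Flaxman et al.: the per-locus factor `1 + s * (favored copies) / 2`, the symmetric advantage of ancestral alleles in deme 1, and the names `fitness` and `sample_parents` are all ours.

```python
import random

def fitness(genotype, s, deme):
    """Multiplicative fitness of one diploid individual.

    genotype: number of derived allele copies (0, 1, or 2) at each locus.
    s: selection coefficient at each locus (near 0 = effectively neutral).
    deme: 1 or 2. ASSUMPTION: derived alleles are favored in deme 2 and
          ancestral alleles are favored symmetrically in deme 1.
    """
    w = 1.0
    for copies, s_i in zip(genotype, s):
        favored = copies if deme == 2 else 2 - copies
        # Codominance: each favored copy adds s_i / 2 to the locus factor.
        w *= 1.0 + s_i * favored / 2.0
    return w

def sample_parents(genotypes, s, deme, n, rng):
    """Draw n parent indices with probability proportional to relative fitness."""
    weights = [fitness(g, s, deme) for g in genotypes]
    return rng.choices(range(len(genotypes)), weights=weights, k=n)

# Two loci: the first under strong selection, the second effectively neutral.
s = [0.5, 0.0]
resident = [2, 0]   # homozygous derived at the strongly selected locus
migrant = [0, 2]    # carries derived alleles only at the neutral locus
parents = sample_parents([resident, migrant], s, deme=2, n=10, rng=random.Random(0))
```

Because reproduction is sampled per individual, the migrant's neutral derived alleles share the fate of its poorly adapted genotype at the strongly selected locus, which is the essence of genome hitchhiking.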

Figure 4: A hypothetical example of migration and selection acting in one generation of the simulation. Both demes undergo “hatching” of new offspring that replace their parents. These offspring may migrate to the other deme (at rate m) and then reproduce, producing gametes at a rate proportional to their relative fitness. The arrows from the second box represent the probability of leaving offspring; the more arrows, the higher the probability. Derived alleles (in red) are favored in Deme 2, and disfavored in Deme 1, but the strength of selection varies. Individual 2 has much lower fitness in Deme 1 because it carries a derived allele at locus E, which is under strong divergent selection. Individual 4 migrates to Deme 2 and is selected against because it is homozygous for allele 1 at locus D, which is also under strong selection. Individual 5 carries derived alleles at two loci: it is homozygous for D2, and also carries one copy of the derived C2 allele, which has essentially no effect on fitness. When individual 5 migrates to Deme 1, even though the derived allele at locus C is under weak selection, individual 5 leaves few offspring because it is also homozygous for the derived allele at D, and so the derived C allele is not incorporated into Deme 1.

The authors then followed the fate of individual mutations through time, and examined how three factors changed over time: the cumulative number of mutations that established, the reduction in the effective migration rate (that is, the actual rate of gene flow between populations, once the effect of selection is taken into account), and differentiation, as measured by the value of FST at each locus.
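
For readers less familiar with FST, the sketch below shows one common way to compute it for a single biallelic locus in two demes, as the proportional reduction in expected heterozygosity within demes relative to the pooled population. This is a textbook formulation assuming equal deme sizes; the estimator actually used by Flaxman et al. may differ, and the function name `fst_two_demes` is our own.

```python
def fst_two_demes(p1, p2):
    """FST at one biallelic locus from derived-allele frequencies in two demes.

    ASSUMPTION: both demes are the same size, so the pooled frequency is the
    simple average of p1 and p2.
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)          # expected heterozygosity, pooled
    if h_total == 0.0:
        return 0.0                                  # locus is monomorphic overall
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total

no_divergence = fst_two_demes(0.5, 0.5)    # same frequency in both demes
partial = fst_two_demes(0.2, 0.8)
fixed_difference = fst_two_demes(0.0, 1.0)  # derived allele fixed in deme 2 only
```

Under this formulation FST runs from 0 (identical allele frequencies) to 1 (alternative alleles fixed in each deme), so tracking it per locus shows which parts of the genome have diverged.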

As we might have predicted, the number of mutations established increased over time, and the effective migration rate decreased over time in most scenarios. Also unsurprisingly, these changes accrued more quickly when selection was strong, and when gene flow was low. More interestingly, both of the models incorporating genome-level effects (GH and DH) showed more rapid increases in the number of mutations established compared to the beanbag model, and both showed more rapid decreases in the effective migration rate. However, these values did not differ substantially between the GH and DH models (see Figures 2-5 in Flaxman et al. 2013).

Plotting FST at each locus also revealed an unexpected outcome (see Figure 6 in Flaxman et al. 2013) – in the GH and DH models each new mutation increased the rate at which subsequent mutations fixed, such that once a few early mutations had established, subsequent mutations fixed almost immediately and overall genetic divergence between the populations increased at an increasing rate. This effect was not seen in the DS scenario, and as with the mutation establishment and effective migration results, there was essentially no difference between the GH and DH scenarios. Thus, for the cases Flaxman et al. considered, there appears to be no additional effect of linkage on speciation with gene flow once loci are packaged into individuals and fitness epistasis is accounted for.

Flaxman and colleagues’ results are surprising from both theoretical and empirical perspectives. Many authors have argued, using both verbal and mathematical models, that linkage can make speciation with gene flow easier; by creating ‘footholds’ of initial divergence, strongly selected loci can protect neighboring loci from the homogenizing effect of gene flow. Additionally, from an empirical perspective, many recent genomic studies of incipient speciation have identified ‘islands’ of genomic divergence – small, tightly linked sections of the genome in which genetic differentiation is much higher than the genome average (for review, see Nosil et al. 2009). Flaxman and colleagues did consider whether diverged loci were clustered into ‘islands’ of divergence, and under weak selection some clustering was observed. However, for most values of selection and migration, diverged loci were no more clustered than would be expected if they were scattered at random across the genome.

It is unclear to what extent Flaxman and colleagues’ results are peculiar to the particular range of parameters they explored, or whether they might be explained by particular decisions about how to build the model. For example, the GH and DH models used multiplicative fitness schemes, where the fitness of each individual was determined by the product of the fitness effects at every locus. It is possible that a simpler model where the fitness effects were additive might have diminished the differences between the DS (beanbag) and GH models, and this in turn might have allowed a greater role for linkage.
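
The contrast between multiplicative and additive fitness schemes can be seen in a toy calculation. The Python sketch below is only an illustration: the per-locus effect of 0.1 and the function names are hypothetical choices of ours, not parameters from Flaxman et al.

```python
def multiplicative_fitness(effects):
    """Fitness as the product of (1 + effect) across loci."""
    w = 1.0
    for e in effects:
        w *= 1.0 + e
    return w

def additive_fitness(effects):
    """Fitness as 1 plus the sum of effects across loci."""
    return 1.0 + sum(effects)

# Ten loci, each carrying a locally favored allele with a hypothetical effect of 0.1.
effects = [0.1] * 10
w_mult = multiplicative_fitness(effects)   # 1.1 ** 10, roughly 2.59
w_add = additive_fitness(effects)          # 1 + 10 * 0.1 = 2.0
```

In the multiplicative scheme each additional favored allele raises fitness by a larger absolute amount than the last, whereas in the additive scheme every allele contributes the same increment regardless of genetic background.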

Nevertheless, Flaxman and colleagues’ work brings a novel theoretical treatment to the growing literature on the genomics of speciation with gene flow. Their results confirm, as Mayr predicted more than fifty years ago, that understanding speciation requires that we consider genome-level effects, rather than the simple addition and removal of beans from a beanbag.

Acknowledgements: Thanks to Sam Flaxman and Jeremy Yoder, who both provided useful comments on earlier drafts of this post.

Dylan Goldade is an undergraduate biology major at Willamette University, currently working on his senior thesis in Chris Smith’s lab building a linkage map for Yucca brevifolia using RAD-tag markers. Kathryn Theiss is a postdoctoral scholar in the Biology Department at Willamette University, studying plant reproduction, evolution and conservation. Chris Smith is an assistant professor at Willamette University, who uses experimental and genetic approaches to study coevolution between plants and insects.


Feder JL, Egan SP, Nosil P (2012) The genomics of speciation-with-gene-flow. Trends in Genetics. doi: 10.1016/j.tig.2012.03.009.

Flaxman SM, Feder JL, Nosil P (2013) Genetic hitchhiking and the dynamic buildup of genomic divergence during speciation with gene flow. Evolution. doi: 10.1111/evo.12055.

Meyer A. (2005). On the importance of being Ernst Mayr. PLoS Biology 3 (5): e152. doi: 10.1371/journal.pbio.0030152.

Nosil P, Funk DJ, Ortiz-Barrientos D (2009) Divergent selection and heterogeneous genomic divergence. Molecular Ecology 18, 375-402. doi: 10.1111/j.1365-294X.2008.03946.x.

Provine WB (2004) Ernst Mayr: Genetics and speciation. Genetics 167, 1041-1046. PMCID: PMC1470966.

Rundle HD, Nosil P (2005) Ecological speciation. Ecology Letters 8, 336-352. doi: 10.1111/j.1461-0248.2004.00715.x.

Via S (2012) Divergence hitchhiking and the spread of genomic isolation during ecological speciation-with-gene-flow. Philosophical Transactions of the Royal Society B 367, 451-460. doi: 10.1098/rstb.2011.0260.


About Jeremy Yoder

Jeremy Yoder is a postdoctoral associate in the Department of Forest and Conservation Sciences at the University of British Columbia. He also blogs at Denim and Tweed, and tweets under the handle @jbyoder.