Genomic diversity and secondary contact

Under a divergence (or isolation) model, the genomes of individuals in a daughter population are expected, given sufficient time since divergence, to show greater differentiation from the sister population and lower differentiation within their own population. Divergence is thus a mechanism of allopatric speciation, with strong selection on regions of the genome that are involved or implicated in furthering reproductive isolation and adaptive evolution. These regions would be expected to show high(er) levels of differentiation compared to the rest of the genome. If secondary contact occurs between diverged sister populations, regions of relatively lower differentiation are possibly introgressed. In common practice, two measures of differentiation have been used to identify these “genomic islands of speciation”: Fst and Dxy. Fst is a relative measure of allele frequency differences between populations, whereas Dxy is an absolute measure: the average number of nucleotide differences (or repeat-length differences, for short tandem repeat (STR) markers) between sequences drawn from the two populations. Both measures have issues with interpretation (see Cruickshank and Hahn (2014) for an excellent review), and Fst is particularly prone to misinterpretation when secondary contact is recent.
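To make the distinction between the two measures concrete, here is a minimal Python sketch that computes both from aligned haplotypes. It rests on assumptions that are not in the papers discussed here: haplotypes are equal-length strings, Dxy is reported as a raw count of differences (divide by the window length for a per-site value), and Fst uses one common sequence-based form, 1 minus the ratio of mean within-population diversity to Dxy (after Hudson, Slatkin and Maddison 1992). The function names are purely illustrative.

```python
from itertools import combinations, product


def pairwise_diff(a, b):
    """Number of sites at which two aligned haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def dxy(pop1, pop2):
    """Absolute divergence: mean differences over all between-population pairs."""
    pairs = list(product(pop1, pop2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def pi_within(pop):
    """Mean differences over all pairs of haplotypes within one population."""
    pairs = list(combinations(pop, 2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def fst(pop1, pop2):
    """Relative differentiation: 1 - (mean within-population diversity / Dxy).

    Assumption: this sequence-based form is one of several Fst estimators.
    """
    mean_within = (pi_within(pop1) + pi_within(pop2)) / 2
    return 1 - mean_within / dxy(pop1, pop2)


# Toy data: two haplotypes per population, four aligned sites.
pop_a = ["AAAA", "AAAT"]
pop_b = ["TTAA", "TTAT"]
print(dxy(pop_a, pop_b), fst(pop_a, pop_b))
```

For this toy data, Dxy is 2.5 differences and Fst is 0.6. Note that Fst depends on within-population diversity: lowering diversity inside either population raises Fst even when Dxy is unchanged, which is the core of the interpretation problem reviewed by Cruickshank and Hahn (2014).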

Gmin versus Fst estimates in 50 kb windows across the X chromosome of populations of D. melanogaster. Shaded regions indicate previously detected introgression. Image courtesy – Figure 4 of Geneva et al. (2015).

Geneva et al. (2015) describe a haplotype-based statistic called “Gmin”, a relative Dxy measure: the ratio of the minimum Dxy between haplotypes sampled from the two populations to the mean Dxy between the populations. It is intended to circumvent the misinterpretation of Fst in the event of secondary contact. Through simulations and the analysis of a Drosophila dataset (Rwandan populations of D. melanogaster), Geneva et al. show that Gmin (a) increases with the divergence time of the two populations but plateaus faster than Fst; (b) has greater sensitivity and specificity than Fst across all compared simulations, with the greatest sensitivity when secondary contact is very recent; and (c) has an expanded range compared to Fst, making it easier to delineate recent introgression, as shown by the ranges detected in D. melanogaster (0.0982 < Gmin < 0.9833, versus 0.0170 < Fst < 0.5107).
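The ratio described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: haplotypes are equal-length aligned strings for a single genomic window, differences are simple mismatch counts, and the function name gmin and the toy sequences are hypothetical.

```python
from itertools import product


def pairwise_diff(a, b):
    """Number of sites at which two aligned haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def gmin(pop1, pop2):
    """Minimum between-population distance divided by the mean (Dxy).

    Returns NaN when the mean is zero, i.e. the window has no
    between-population differences and the ratio is undefined.
    """
    diffs = [pairwise_diff(a, b) for a, b in product(pop1, pop2)]
    mean_d = sum(diffs) / len(diffs)
    if mean_d == 0:
        return float("nan")
    return min(diffs) / mean_d


# Toy window, six aligned sites.
pop_a = ["AAAAAA", "AAAAAT"]
isolated = ["TTTTAA", "TTTTAT"]    # every haplotype is diverged from pop_a
introgressed = ["TTTTAA", "AAAATA"]  # second haplotype resembles pop_a

print(gmin(pop_a, isolated), gmin(pop_a, introgressed))
```

In this toy case, Gmin is about 0.89 for the isolated pair of populations and drops to about 0.33 once a single haplotype resembling the other population is present: the minimum distance falls sharply while the mean changes less. In a genome scan, the same function would be applied to each window in turn (for example, the 50 kb windows used in the figure above).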

As Cruickshank and Hahn (2014) caution, Geneva et al. also highlight the importance of checking for unusually low (or high) values of absolute divergence in regions of low Gmin (or Fst), to avoid misinterpretation.

“However, in cases of recent secondary contact, and when the rates of gene flow are not extremely high, we have shown that Gmin performs well and is more reliable than Fst.” (Geneva et al. 2015)


Geneva, Anthony J., et al. “A new method to scan genomes for introgression in a secondary contact model.” PLoS ONE 10.4 (2015): e0118621. DOI:

Cruickshank, Tami E., and Matthew W. Hahn. “Reanalysis suggests that genomic islands of speciation are due to reduced diversity, not reduced gene flow.” Molecular Ecology 23.13 (2014): 3133-3157. DOI:


About Arun Sethuraman

I am a computational biologist, and I build statistical models and tools for population genetics. I am particularly interested in studying the dynamics of structured populations, genetic admixture, and ancestral demography.