Should I use FST, G'ST or D?

I have heard many researchers extolling one estimator over another, saying papers using any other approach should be rejected without review. Yet, although numerous recent papers assess various population genetic parameters and their validity in analyzing population structure, gene flow, and divergence, no clear consensus has been reached. FST is probably the most widely used measure of genetic distance between populations, along with the closely related estimator GST, but there are now also RST, G’ST, ΦST, D and even G”ST, as well as many others. FST has a big advantage in terms of familiarity – it has been around long enough that we have some idea of what its properties are. It can be thought of as the reduction in heterozygosity due to population structure, or (for biallelic markers) the variance in allele frequencies among populations. It behaves in predictable ways in response to particular circumstances.

FST was originally formulated to measure genetic distance using biallelic markers (Wright 1969), and was later generalized to multiple alleles by Nei (1973) under the name GST (confusingly, the terms GST and FST are often used interchangeably; here I will generally use GST). However, numerous recent papers (e.g. Hedrick 2005, Jost 2008, Jost 2009, Meirmans and Hedrick 2011) have pointed out a difficulty in interpreting GST. With two populations and two alleles, GST ranges from 0 to 1, as expected, with 0 representing no difference in allele frequencies between the two populations and 1 indicating that the two populations are fixed for alternate alleles. With more than two alleles, however, GST cannot reach 1 even when no alleles are shared between the two populations, because some heterozygosity will always remain within populations. This is not a flaw in GST, but it does mean that we have to be careful when interpreting GST or comparing GST across different types of markers.

In many situations this can be quite problematic: for microsatellites with high heterozygosity, the maximum GST is often only 0.1-0.2! In these cases Wright’s (1978) guidelines, under which values of 0-0.05 indicate “little” genetic differentiation, 0.05-0.15 “moderate”, 0.15-0.25 “great”, and so on, are entirely misleading. They are only plausible for biallelic markers; in other situations we cannot rely on such simple rules of thumb.
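The ceiling on GST can be illustrated with a small sketch. It assumes Nei's definition GST = (HT - HS)/HT computed from known allele frequencies with equal population weights and no sample-size correction; the function names and the toy populations (two populations that share no alleles, each carrying n equally frequent private alleles) are hypothetical choices for illustration.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def nei_gst(pops):
    """Nei's GST from per-population allele frequency lists (equal weights).

    Assumes frequencies are known exactly; no sample-size bias correction.
    """
    k = len(pops)
    # HS: mean within-population expected heterozygosity
    hs = sum(expected_heterozygosity(p) for p in pops) / k
    # HT: expected heterozygosity of the mean allele frequencies
    mean_freqs = [sum(col) / k for col in zip(*pops)]
    ht = expected_heterozygosity(mean_freqs)
    if ht == 0.0:
        return 0.0  # monomorphic locus: no diversity to partition
    return (ht - hs) / ht


def disjoint_pops(n):
    """Two populations with no shared alleles, each with n equally frequent alleles."""
    pop1 = [1.0 / n] * n + [0.0] * n
    pop2 = [0.0] * n + [1.0 / n] * n
    return [pop1, pop2]


for n in (1, 2, 5, 10):
    print(n, round(nei_gst(disjoint_pops(n)), 3))
```

In this toy case GST works out to 1/(2n - 1): it equals 1 only when each population is fixed for a single allele, and drops to about 0.05 with ten private alleles per population, even though the populations share nothing.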

To account for the variation in the maximum obtainable GST, Hedrick (2005) proposed a “standardized” measure, G’ST, calculated by dividing GST for a given marker by the maximum theoretical GST based on the heterozygosity at that marker. Additionally, in 2008 Jost introduced another measure of differentiation, D, which measures the fraction of allelic variation among populations. Both G’ST and Jost’s D will be 1 at complete differentiation (even with high variation within populations) and are zero with no differentiation, and both statistics have other intuitively appealing properties.
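As a rough sketch of how the two statistics relate to GST, the snippet below uses the parametric forms usually given for them: Hedrick's G’ST = GST / GST(max), with GST(max) = (k - 1)(1 - HS)/(k - 1 + HS), and Jost's D = [(HT - HS)/(1 - HS)] × [k/(k - 1)], where k is the number of populations. The function name and the input values are hypothetical, and sample-size corrections are deliberately left out.

```python
def standardized_measures(hs, ht, k):
    """Return (GST, G'ST, D) from HS, HT and the number of populations k.

    Assumes a polymorphic locus (HT > 0, HS < 1) and k >= 2; uses the
    uncorrected parametric formulas, not the sample estimators.
    """
    gst = (ht - hs) / ht
    # Largest GST attainable given the within-population heterozygosity
    gst_max = (k - 1) * (1.0 - hs) / (k - 1 + hs)
    g_prime_st = gst / gst_max
    jost_d = (ht - hs) / (1.0 - hs) * k / (k - 1)
    return gst, g_prime_st, jost_d


# Two populations sharing no alleles, each with 10 equally frequent alleles:
# HS = 0.9 and HT = 0.95.
gst, g_prime_st, jost_d = standardized_measures(hs=0.9, ht=0.95, k=2)
print(round(gst, 3), round(g_prime_st, 3), round(jost_d, 3))
```

With these inputs GST is only about 0.05, while G’ST and D both come out at 1, which is the behaviour at complete differentiation described above.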

Figure 1. Effect of migration rate, with μ = 0.001 (from Whitlock 2011, Fig. 1).

Both measures, though, have problems, and can be just as difficult to interpret as GST. G’ST and Jost’s D behave quite well in some cases, e.g. high mutation rates and two populations (Fig. 1, left), closely tracking the “true” divergence as measured by coalescent FST. However, while GST is limited to low values when heterozygosity is high, G’ST and D are in many cases biased upwards, rarely falling much below one even with high migration rates when there are many populations (Fig. 1, right). So neither G’ST nor D should be used as a proxy for migration rates in most situations. Additionally, D is strongly affected by mutation rates, so it is hard to compare multiple loci, or even to compare the same locus in different species if mutation rates differ. Nevertheless, if allelic differentiation at a particular locus is the value of interest, it appears that D is the best measure.

GST has a fairly straightforward relationship to gene flow and mutation rate, with patterns driven by migration when mutation rates are low relative to migration rates. In many cases gene flow can be safely assumed to be high relative to mutation rate, so in much of the literature GST is used to assess migration rates. This may be true for DNA sequence polymorphisms, where mutation rates are typically on the order of 10⁻⁹ to 10⁻⁸. In contrast, however, some markers such as microsatellites, which are widely used in population genetic studies due to their high variability, have mutation rates ranging from 10⁻⁶ to 10⁻³. As researchers often screen for the most variable markers, many studies are likely relying on microsatellites on the high end of this range. At these high mutation rates, D and G’ST will be very close to one and GST will be close to zero for most markers, regardless of migration rates (Fig. 2). What to do?

Figure 2. Effect of mutation rate (from Whitlock 2011, Fig. 2).
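To get a feel for why mutation rate matters, here is a minimal sketch using a standard textbook approximation rather than the simulations behind the figure: under the infinite-island, infinite-alleles model, equilibrium GST ≈ 1/(1 + 4N(m + μ)). The model choice, the function name, and the parameter values (N = 1000, m = 10⁻⁴) are all assumptions made for illustration.

```python
def island_gst(n_e, m, mu):
    """Approximate equilibrium GST under the infinite-island, infinite-alleles model.

    n_e: effective size per deme; m: migration rate; mu: mutation rate.
    """
    return 1.0 / (1.0 + 4.0 * n_e * (m + mu))


N_E = 1000   # hypothetical effective population size per deme
M = 1e-4     # hypothetical migration rate

# Sweep from sequence-like to microsatellite-like mutation rates
for mu in (1e-9, 1e-6, 1e-4, 1e-3):
    print(f"mu={mu:.0e}  GST~{island_gst(N_E, M, mu):.3f}")
```

While μ is far below m, the expected GST barely moves (about 0.71 here) and reflects migration; once μ reaches 10⁻³, it falls to roughly 0.19 with migration unchanged, so the statistic no longer tells us much about gene flow.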

It is becoming quite clear that high-mutation markers are not a good choice if one wants to calculate GST or any related measure, except in very particular situations. RST and ΦST can be used to account for mutation rates if the assumption of step-wise mutations is met (see for example Kronholm et al. 2010), but other estimates of population differentiation are either misleading or uninformative. Highly variable microsatellites are great for some things (genetic mapping, looking for recent selective sweeps, or parentage analysis, for example), but not for calculating GST.

When using such markers, if selection or migration is really the population genetic feature that one wants to measure, it may be best to evaluate these parameters directly rather than using something like GST. If migration is of interest, one can use coalescent-based likelihood methods such as IM or MIGRATE to estimate migration rates from the data. Selection, too, can be assessed more productively using other approaches (e.g. lnRH, which tests for selective sweeps) rather than trying to identify GST outliers. One additional solution frequently recommended for assessing population structure using high-mutation markers is to calculate both GST and D. Markers where GST underestimates divergence should have significantly elevated values of D. Where this pattern is observed, allelic variation can be investigated in greater detail. However, more work needs to be done on the interpretation of these two measures in concert.

Over the past few decades researchers have increasingly used microsatellites, due to their high level of variability and the relative ease of development and scoring in non-model systems. However, now that next-generation sequencing is getting more affordable, sequence-based markers can be assessed throughout the genome (e.g. using RAD sequencing). As we move back towards such low-mutation-rate markers as SNPs, FST becomes easier to assess reliably. On the other hand, FST and other current methods are all designed to assess one or a few markers at a time, and genomic approaches just apply these methods thousands or tens of thousands of times for markers throughout the genome. One can look for outliers, calculate means, etc., without really taking full advantage of the data. For instance, I have seen bi-modal or skewed distributions of FST and other summary statistics; clearly means and standard deviations can be misleading in these cases. My hope is that new methods for assessing divergence will focus not on individual loci but on many markers throughout the genome.
For more on this, see the articles listed below, which are the source of most of the ideas presented here. I particularly recommend the excellent, recently published article by Whitlock (2011).


Hedrick PW (2005) A standardized genetic differentiation measure. Evolution, 59, 1633–1638.

Jost L (2008) GST and its relatives do not measure differentiation. Molecular Ecology, 17, 4015–4026.

Jost L (2009) Reply: D vs G’ST: response to Heller and Siegismund (2009) and Ryman and Leimar (2009). Molecular Ecology, 18, 2088–2091.

Kronholm I, Loudet O, de Meaux J (2010) Influence of mutation rate on estimators of genetic differentiation—lessons from Arabidopsis thaliana. BMC Genetics, 11, 33.

Meirmans PG, Hedrick PW (2011) Assessing population structure: FST and related measures. Molecular Ecology Resources, 11, 5–18.

Michalakis Y, Excoffier L (1996) A generic estimation of population subdivision using distances between alleles with special reference for microsatellite loci. Genetics, 142, 1061–1064.

Nei M (1973) Analysis of gene diversity in subdivided populations. Proceedings of the National Academy of Sciences, 70, 3321–3323.

Whitlock M (2011) G’ST and D do not replace FST. Molecular Ecology, 20, DOI: 10.1111/j.1365-294X.2010.04996.x

Wright S (1969) Evolution and the Genetics of Populations, Vol. 2. University of Chicago Press, Chicago.
