Essentially, all models are wrong, but some are useful. — George Box
The publication of Li and Durbin’s 2011 paper, “Inference of human population history from individual whole-genome sequences”, was a milestone in demographic inference.
By allowing population dynamics to be estimated from a single diploid genome, Li and Durbin’s pairwise sequentially Markovian coalescent (PSMC) model is perfectly suited for the genomic era of “less is more”, i.e. sequencing whole genomes of a few individuals rather than a few loci of many individuals.
“The distribution of the time since the most recent common ancestor (TMRCA) between two alleles in an individual provides information about the history of change in population size over time.” (Li and Durbin 2011)
PSMC uses the coalescent approach to estimate changes in population size. Each diploid genome is a collection of hundreds of thousands of independent loci. PSMC estimates the TMRCA of the two alleles at each locus and combines these estimates into a TMRCA distribution across the genome. Because the rate of coalescent events is inversely proportional to effective population size (Ne), this distribution reveals periods of Ne change. For example, when many loci coalesce at the same time, it is a sign of small Ne at that time.
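To make the link between coalescence times and Ne concrete, here is a minimal simulation sketch. It is not PSMC itself (PSMC infers TMRCA along the genome with a hidden Markov model); it only draws TMRCAs for independent loci under a piecewise-constant Ne, assuming two lineages coalesce at rate 1/(2Ne) per generation. The function names, the epoch layout, and the Ne values are hypothetical choices for illustration.

```python
import random

def sample_tmrca(epochs, rng):
    """Draw one TMRCA (in generations) for the two alleles at a locus.

    epochs: list of (start_generation, Ne), ordered back in time and
    starting at 0; each Ne holds until the next epoch's start.
    """
    for i, (start, ne) in enumerate(epochs):
        end = epochs[i + 1][0] if i + 1 < len(epochs) else float("inf")
        # Two lineages coalesce at rate 1 / (2 * Ne) per generation.
        wait = rng.expovariate(1.0 / (2.0 * ne))
        if start + wait < end:
            return start + wait
        # Otherwise the lineages survive this epoch; because the waiting
        # time is memoryless, we simply redraw in the next epoch.
    raise AssertionError("unreachable: the last epoch is unbounded")

def fraction_in_window(epochs, lo, hi, n_loci=20000, seed=1):
    """Fraction of simulated loci whose TMRCA falls in [lo, hi)."""
    rng = random.Random(seed)
    hits = sum(lo <= sample_tmrca(epochs, rng) < hi for _ in range(n_loci))
    return hits / n_loci

# Hypothetical histories: constant size vs. a bottleneck 1000-1500
# generations ago.
constant = [(0, 10000)]
bottleneck = [(0, 10000), (1000, 500), (1500, 10000)]

frac_constant = fraction_in_window(constant, 1000, 1500)
frac_bottleneck = fraction_in_window(bottleneck, 1000, 1500)
print(frac_constant, frac_bottleneck)
```

Under these assumed parameters, only a few percent of loci coalesce in the 1000-1500 generation window when Ne is constant, while more than a third do so during the bottleneck. That excess of coalescences in one time window is the kind of signal that points to a small Ne.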
This approach is becoming extremely popular in whole-genome studies and is of particular interest in ancient DNA and conservation genomics. Among others, it has been applied to study the demographic history of the giant panda (Zhao et al. 2012), the passenger pigeon (Hung et al. 2014) and the woolly mammoth (Palkopoulou et al. 2015).
However, PSMC has several considerable limitations that should be kept in mind.
- It doesn’t recover sudden changes in Ne.
- Nor does it recover recent changes, e.g. younger than 10,000 years BP in humans (Li and Durbin 2011).
- Simulations suggest that it also performs worse for very ancient changes in Ne (Mazet et al. 2015).
- Using an incorrect mutation rate or generation time can bias the interpretation.
- An apparent change in Ne in a PSMC plot can actually be caused by population structure.
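The sensitivity to mutation rate and generation time can be illustrated with a short sketch of how coalescent-scaled estimates are converted to real units. It assumes the standard scaling in which the per-site diversity is theta = 4 * N0 * mu, time is measured in units of 2 * N0 generations, and population size is expressed relative to N0. The function name and all input values are hypothetical, not output from an actual PSMC run.

```python
def rescale(theta_per_site, scaled_times, relative_sizes, mu, generation_time):
    """Convert coalescent-scaled estimates to years and Ne.

    Assumes theta = 4 * N0 * mu per site, time in units of 2 * N0
    generations, and sizes given relative to N0.
    """
    n0 = theta_per_site / (4.0 * mu)
    years = [2.0 * n0 * t * generation_time for t in scaled_times]
    sizes = [n0 * lam for lam in relative_sizes]
    return years, sizes

# Hypothetical scaled estimates.
theta = 0.001
times = [0.1, 0.5, 1.0]
lams = [1.0, 0.4, 2.0]

# Same estimates, two assumed mutation rates (per site per generation).
years_a, ne_a = rescale(theta, times, lams, mu=2.5e-8, generation_time=25)
years_b, ne_b = rescale(theta, times, lams, mu=1.25e-8, generation_time=25)

for ya, yb, na, nb in zip(years_a, years_b, ne_a, ne_b):
    print(f"{ya:.0f} vs {yb:.0f} years, Ne {na:.0f} vs {nb:.0f}")
```

Under this scaling, halving the assumed mutation rate doubles both the inferred Ne and the inferred dates, so the whole curve shifts rather than just one axis. The generation time, by contrast, only stretches or compresses the time axis in years.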