Totally RAD, Part 2


Edit (8/20/15): I used the wrong web address for Kimberly Andrews! Go check out her work here. Sorry Kim!
Restriction site-associated DNA sequencing (RADseq) is quickly becoming the go-to method for collecting population genetic data, and as with any technique that is exploding in popularity, methodological difficulties are coming along with it.
Last month, Stacy pointed you towards a review of RADseq protocols that detailed some methodological differences, but of course, there is always more detail out there. In the most recent issue of Molecular Ecology, Kimberly Andrews and colleagues provided a reply to the Puritz et al. paper, adding clarity to the nuances that separate the different RADseq protocols.
Specifically, Andrews and colleagues go into more depth on the consequences of PCR duplicates, the product of amplification biases during PCR.

The impact of PCR duplicates on population genomics analyses has not been quantified in the literature, but high frequencies of duplicates are expected to impact analyses by falsely increasing homozygosity and by making PCR errors appear to be true alleles (false alleles, Pompanon et al. 2005).

The simplest way to deal with this problem, and to avoid other issues of fragment size bias, is to make fragments different sizes from the beginning, which is what a random-shearing step does.

…the most straight-forward method currently developed for identifying RADseq PCR duplicates can only be used for data generated using methods that have a random-shearing step and also generate paired-end sequences (PE-RADseq). For these methods, PCR duplicates can be identified as fragments that are identical in insert length and sequence composition, because random shearing ensures that fragments at a given locus are unlikely to be of equal length unless they are duplicates
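To make the logic of that quote concrete, here is a minimal Python sketch of the idea, not the authors' pipeline or any published tool. It assumes each paired-end read has already been assigned to a locus and that its insert length is known; the `ReadPair` fields, the `mark_duplicates` function, and the toy sequences are all hypothetical and made up for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class ReadPair(NamedTuple):
    """Hypothetical record for one paired-end read (illustration only)."""
    locus: str          # assumed locus ID, e.g. derived from the cut site
    insert_length: int  # distance from the cut site to the sheared end
    forward_seq: str    # read starting at the restriction site
    reverse_seq: str    # read starting at the randomly sheared end


def mark_duplicates(pairs):
    """Group read pairs that match in locus, insert length and sequence.

    Returns one representative per group plus the number of reads
    flagged as putative PCR duplicates.
    """
    groups = defaultdict(list)
    for pair in pairs:
        key = (pair.locus, pair.insert_length, pair.forward_seq, pair.reverse_seq)
        groups[key].append(pair)
    unique = [members[0] for members in groups.values()]
    n_duplicates = len(pairs) - len(unique)
    return unique, n_duplicates


# Toy data: only the second read pair is an exact copy of another one.
pairs = [
    ReadPair("locus1", 312, "ACGTAC", "TTGACA"),
    ReadPair("locus1", 312, "ACGTAC", "TTGACA"),  # same length and sequence
    ReadPair("locus1", 287, "ACGTAC", "GGCATT"),  # different shear point
    ReadPair("locus1", 312, "ACGTAT", "TTGACA"),  # same length, other allele
    ReadPair("locus2", 312, "CCGGAA", "TTGACA"),  # different locus
]
unique, n_duplicates = mark_duplicates(pairs)
print(len(unique), "unique fragments,", n_duplicates, "putative duplicate(s)")
```

Note that this exact-match rule would not catch a duplicate carrying a PCR or sequencing error, since its sequence would no longer be identical to the original; a real workflow would need to decide how to handle near-identical reads.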

Unfortunately, for RADseq protocols without a random-shearing step (which is most of them), there is currently no well-supported way to correct this issue. However, you can bet that there are a number of approaches in the works.
Lastly, the authors reiterate an important consideration for anyone who is weighing RADseq data as an option for answering the scientific question of their choice: think hard about costs and technical complexity. Depending on whether you have the option to pool samples or not, the resources devoted to a project can vary widely.
Welcome to the RAD fad. Better get a big cup of coffee, because you’ve got a lot of reading to do.
 
Andrews K.R., Miller M.R., Hand B., Seeb J.E. & Luikart G. (2014). Trade-offs and utility of alternative RADseq methods, Molecular Ecology, n/a-n/a. DOI: http://dx.doi.org/10.1111/mec.12964
Pompanon F., Bellemain E. & Taberlet P. (2005). Genotyping errors: causes, consequences and solutions, Nature Reviews Genetics, 6 (11) 847-859. DOI: http://dx.doi.org/10.1038/nrg1707
