There are typically two main solutions to this problem.
- The first is sequence capture, a method that uses designed oligonucleotides to isolate specific regions of the genome. This method, while quite accurate in the genomic regions it can pull down, is relatively expensive, slow, and not scalable for use with many samples.
- The alternative method is a restriction enzyme-based approach (RAD, ddRAD, genotyping by sequencing, etc.). These methods use restriction enzymes to pull down only a portion of the genome and are relatively flexible in the degree to which they reduce complexity. While RAD approaches allow numerous samples to be genotyped at low cost, it is difficult to control the exact location and number of loci that will be sequenced.
But don’t worry, the RAPTURE is here! In their recent paper, Ali et al. introduce a protocol called Rapture, which combines RAD sequencing with sequence capture (RAD Capture) and is currently my favorite portmanteau. What is this method, you might ask? Well, it is a way to genotype hundreds to thousands of individuals at hundreds to a few thousand loci quickly and cheaply.
Here’s how it works. The RAD prep involves a two-step barcoding process: 1) a set of 96 RAD barcodes is used to identify each individual within a library; 2) these 96 individuals are pooled and prepped as one library with an Illumina barcode. This allows for the multiplexing of libraries of 96 individuals, where each individual is uniquely identified by the combination of its RAD and Illumina barcodes. This method isn’t new, though the authors have improved the basic RAD prep to increase on-target reads using a clever method involving biotinylated RAD adaptors, and have significantly reduced clonal reads. What is novel is that these RAD libraries are then pooled and reduced further through hybridization to oligonucleotide baits that correspond to selected RAD sites.
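To make the two-level barcoding concrete, here is a minimal Python sketch of how a read could be traced back to an individual from its Illumina barcode plus its RAD barcode. Everything in it is an assumption for illustration: the barcode sequences are made-up placeholders (not the sets used by Ali et al.), the function names are hypothetical, the RAD barcode is assumed to sit at the very start of the read, and matching is exact with no error tolerance.

```python
from itertools import islice, product

# Placeholder barcodes for illustration only; real barcode sets are
# designed to tolerate sequencing errors and are NOT generated like this.
RAD_BARCODE_LENGTH = 8
rad_barcodes = ["".join(p) for p in islice(product("ACGT", repeat=RAD_BARCODE_LENGTH), 96)]
illumina_barcodes = ["AAAAAA", "CCCCCC", "GGGGGG"]  # one per library of 96


def build_sample_map(library_barcodes, individual_barcodes):
    """Give every (Illumina barcode, RAD barcode) pair its own sample ID."""
    sample_map = {}
    for lib_number, lib in enumerate(library_barcodes, start=1):
        for ind_number, rad in enumerate(individual_barcodes, start=1):
            sample_map[(lib, rad)] = f"lib{lib_number:02d}_ind{ind_number:02d}"
    return sample_map


def assign_read(sample_map, illumina_barcode, read_sequence, barcode_length=RAD_BARCODE_LENGTH):
    """Return the sample ID for a read, or None if the barcode pair is unknown.

    Assumes the RAD barcode is the first `barcode_length` bases of the read.
    """
    rad = read_sequence[:barcode_length]
    return sample_map.get((illumina_barcode, rad))


sample_map = build_sample_map(illumina_barcodes, rad_barcodes)
example_read = rad_barcodes[0] + "TGCAGGACCTTAGC"  # made-up sequence
print(len(sample_map))                               # 3 libraries x 96 = 288
print(assign_read(sample_map, "CCCCCC", example_read))
```

With three libraries of 96, the map holds 288 unique individuals, and the same RAD barcode can be reused in every library because the Illumina barcode tells the libraries apart.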
The beauty of this approach is that thousands of individuals can be multiplexed following the RAD preps to allow for one sequence capture reaction. The resulting captured DNA is then sequenced on one lane (or really, part of a lane) of the Illumina platform.
The authors assessed the efficiency of this method using 288 rainbow trout samples with 500 sequence capture baits. These samples were prepped for both standard RADseq and Rapture and sequenced. The sequence capture was performed at full reaction strength and at 1/5 reaction strength, to simulate higher concentrations of individuals (5 x 288 = 1440 individuals). Rapture resulted in an average sequence coverage of 16x, compared to 0.43x for RADseq, and required only 100,000 sequenced fragments per individual to confidently call ~600 SNPs present in the first 84 bases following the cut site (see figure below). To put this sequencing effort in context, my last lane on an Illumina HiSeq 3000 produced 350 million paired-end reads, which, in theory, means a library of 96 individuals would require about 3% of a lane of sequencing.
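The back-of-the-envelope math behind that 3% figure can be written out in a few lines of Python. This is only a sketch under simplifying assumptions of mine: each paired-end read counts as one sequenced fragment, reads are split perfectly evenly among individuals, and every read is usable. The function name is hypothetical.

```python
def lane_fraction(n_individuals, fragments_per_individual, reads_per_lane):
    """Fraction of one lane needed, assuming perfectly even, fully usable reads."""
    return n_individuals * fragments_per_individual / reads_per_lane


READS_PER_LANE = 350_000_000        # my last HiSeq 3000 lane
FRAGMENTS_PER_INDIVIDUAL = 100_000  # per-individual target from the paper

fraction_for_96 = lane_fraction(96, FRAGMENTS_PER_INDIVIDUAL, READS_PER_LANE)
max_individuals = READS_PER_LANE // FRAGMENTS_PER_INDIVIDUAL

print(f"{fraction_for_96:.1%} of a lane for 96 individuals")
print(f"{max_individuals} individuals per lane in theory")
```

Real libraries are never perfectly even, so treat the theoretical ceiling as an upper bound rather than a target.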
In the spirit of transparency, I should note that I’m currently using this method to assess parentage in ~2,500 individuals. While this means I’m no impartial voice, I do think that Rapture makes large-scale genotyping projects feasible. From a monetary perspective, costs include RAD prep (~$2/sample), sequence capture (~$200/reaction), and the actual sequencing, which means 1,000 individuals pooled into a single capture reaction will cost around $2,200 plus sequencing. Not to mention the massive amount of time saved when you don’t have to run a bunch of microsats for each of your 1,000 samples.
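For anyone budgeting their own project, the cost estimate above reduces to a one-line formula. The sketch below uses the approximate prices quoted in this post as defaults and assumes all individuals go into a single capture reaction unless you say otherwise; the function name and parameters are hypothetical, and sequencing is deliberately left out.

```python
def rapture_prep_cost(n_individuals, rad_prep_per_sample=2.0,
                      capture_per_reaction=200.0, n_capture_reactions=1):
    """Approximate library prep cost in dollars, excluding sequencing."""
    return (n_individuals * rad_prep_per_sample
            + n_capture_reactions * capture_per_reaction)


cost_1000 = rapture_prep_cost(1000)
print(f"${cost_1000:,.0f} plus sequencing for 1,000 individuals")
```

Because the capture cost is per reaction rather than per sample, the per-individual cost drops as more individuals are pooled into each capture.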
Ali, O. A., O’Rourke, S. M., Amish, S. J., Meek, M. H., Luikart, G., Jeffres, C., & Miller, M. R. (2015). RAD Capture (Rapture): Flexible and efficient sequence-based genotyping. Genetics, 115. DOI:10.1534/genetics.115.183665