The almighty CRISPR-Cas9 technology: How does it work?

CRISPR-Cas9 took the whole world of biology by storm. Selected as Science’s 2015 Breakthrough of the Year, the CRISPR-Cas9 technology is revolutionizing science. Within five years of the official announcement (Jinek et al. 2012), it became the genome-editing technique of choice. The secret? It’s easy, cheap and precise.

Crystal Structure of Cas9 in Complex with Guide RNA and Target DNA (Nishimasu et al. 2014)


Step by step, the CRISPR-Cas9 technology is infiltrating scientific fields. One field where it might turn out to be a game-changer is conservation biology. But the future directions of CRISPR-Cas9 in conservation practice have to be thoroughly considered and discussed. In the last twelve months alone, major scientific journals like Nature Reviews Genetics, Cell, Trends in Biotechnology, and PNAS have published reviews on genome engineering for conservation purposes.

So, if phrases like gene drive, guide RNA, and protospacer make your head spin, this post is for you. If you didn’t know they existed, it’s for you too. And even for you who have no idea what the acronym CRISPR actually means. Like me until yesterday.

Continue reading

Posted in evolution, genomics, methods, theory | 1 Comment

How Molecular Ecologists Work: Hopi Hoekstra on hiring good people, setting the tone, and remembering sushi orders

Welcome to How Molecular Ecologists Work! Today I’m starting our bonus interviews with Dr. Hopi Hoekstra, Professor of Zoology at Harvard University. Hopi and her lab study the mechanisms of adaptation in the wild and in the laboratory, and she is one of the newest electees to the National Academy of Sciences. Lucky for us, she’s happy to share how she works. Continue reading

Posted in career, interview | 1 Comment

Respect the old but seek out the new: Direct 16S rRNA-seq from bacterial communities

Adapted from Woese Bacterial evolution 1987


I think it’s fair to say that it’s an ongoing struggle to figure out what the heck microbes are doing in their natural environments, and who those microbes are. Clearly, there is no silver bullet that gives us all the easy answers. Sequencing and comparison of 16S rRNA gene sequences have become a routine method for surveying microbes in a sample, since these sequences are made up of regions that are incredibly conserved as well as others that are “hypervariable”, allowing distantly related taxa to be compared, while still resolving closely related lineages.
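To make the conserved-versus-hypervariable idea concrete, here is a minimal sketch that scores each column of a small alignment by Shannon entropy. The sequences are made up for illustration (they are not real 16S rRNA data), and the function names and the 0.5-bit threshold are arbitrary assumptions, not part of any standard pipeline.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the bases found in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def classify_columns(alignment, threshold=0.5):
    """Label each column 'conserved' or 'variable' using an entropy cutoff.

    The threshold is an arbitrary illustrative choice.
    """
    return [
        "conserved" if column_entropy(col) <= threshold else "variable"
        for col in zip(*alignment)
    ]


# Toy, made-up aligned sequences: identical flanks around two variable sites.
toy_alignment = [
    "ACGTACGGTA",
    "ACGTTCGGTA",
    "ACGTGAGGTA",
    "ACGTCTGGTA",
]

labels = classify_columns(toy_alignment)
for position, label in enumerate(labels, start=1):
    print(position, label)
```

In this toy case the identical flanks come out as conserved and the two middle columns as variable, which mirrors why primers are placed in conserved stretches while the variable stretches carry the taxonomic signal.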

About ten bajillion (slight under exaggeration) studies have been published using the 16S rRNA gene, which has many times been referred to as the new “gold standard” to classify strains and identify species. The microbial ‘species’ concept thing is a can of worms we don’t need to open up right now; the main point here is that the 16S rRNA gene is widely used and helpful for characterizing microbial communities.

Continue reading

Posted in community ecology, microbiology, RNAseq | Leave a comment

How Molecular Ecologists Work: John McCormack on luck, not closing doors, and just a touch of hustle-and-bustle

Welcome to the next installment in the How Molecular Ecologists Work series!

This entry is from Dr. John McCormack, assistant professor at Occidental College. John is a member of the team that pioneered the use of ultraconserved elements, and his lab at Occidental combines museum-based data with modern molecular methods to ask questions about biological diversity and evolution. Continue reading

Posted in career, interview | Leave a comment

Is equilibrium out of reach or are there some sneaky bouts of sex?

Reproductive systems impact the evolution of genetic diversity at the population level. Yet, we don’t know a lot about organisms that are partially clonal, despite the large component of biodiversity that dabbles in asexual reproduction to varying degrees.


Clonal dynamics are interesting from a purely intellectual level, but are also quite important from an applied perspective as cultivated species, pathogens and invasive species often undergo asexual reproduction. What are the long term impacts on their evolvability?

The conclusions we draw from studies on partially clonal species depend on a correct understanding of the effects of their reproductive mode on the genetic composition of their populations (Reichel et al. 2016).

Previously, I outlined some of the genetic hallmarks that can hint at clonality (see here and here). One of these signatures is heterozygote excess, in which heterozygosity is theorized to increase with asexual reproduction, though there are not many empirical tests of this hypothesis (but see Halkett et al. 2005, Guillemin et al. 2008, Krueger-Hadfield et al. 2016).

Negative Fis has been previously used to demonstrate exclusive clonality (e.g., Balloux et al. 2003), but other studies have hinted at the impact of temporal dynamics on Fis in natural populations (Stoeckel and Masson 2014). Indeed, in natural populations, both negative and positive Fis values are observed.
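For readers who want the arithmetic behind these statements, Fis compares observed heterozygosity (Ho) with the heterozygosity expected under random mating (He): Fis = 1 - Ho/He. Heterozygote excess therefore gives negative values and heterozygote deficit gives positive ones. The sketch below computes this for a single locus from made-up diploid genotypes. It is a simplified illustration, not the estimator used in any of the cited studies: the function name is hypothetical, and He is computed without a small-sample correction.

```python
from collections import Counter


def fis_single_locus(genotypes):
    """Return Fis = 1 - Ho/He for one locus.

    genotypes: list of (allele, allele) tuples, one per diploid individual.
    He is the uncorrected expected heterozygosity, 1 - sum(p_i ** 2).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")

    # Observed heterozygosity: fraction of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n

    # Expected heterozygosity from the allele frequencies in the sample.
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_alleles = 2 * n
    he = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    if he == 0:
        raise ValueError("locus is monomorphic; Fis is undefined")

    return 1.0 - ho / he


# Made-up examples:
all_heterozygous = [("A", "B")] * 10                  # e.g. a single fixed heterozygous clone
hardy_weinberg = [("A", "A")] * 25 + [("A", "B")] * 50 + [("B", "B")] * 25
no_heterozygotes = [("A", "A")] * 50 + [("B", "B")] * 50

print(fis_single_locus(all_heterozygous))
print(fis_single_locus(hardy_weinberg))
print(fis_single_locus(no_heterozygotes))
```

The three toy populations span the range: every individual heterozygous (strongly negative Fis), Hardy-Weinberg proportions (Fis of zero), and no heterozygotes at all (strongly positive Fis).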

Reichel et al. (2016) explore the joint effects of partial clonality, mutation and genetic drift on Fis under increasing rates of clonality in BMC Genetics.

Based on their mathematical models, they argue for a dynamic interpretation of Fis. Negative values cannot alone be used as unequivocal evidence of extremely rare sexual events. Likewise, non-negative Fis, including Fis=0, isn’t a reliable indicator of an absence of clonality.

It will be necessary to provide complementary observations, such as the frequency distribution of multilocus genotypes and population history, along with time series data, in order to discriminate between different hypotheses on the frequency of clonality when mean Fis deviates from zero and when there is large variation of Fis across loci. In addition, more loci are needed to study partially clonal populations than exclusively sexual ones. This might be achieved by moving from population genetics to population genomics.


Balloux et al. (2003) The population genetics of clonal and partially clonal diploids. Genetics 164:1635–44.

Halkett et al. (2005) Tackling the population genetics of clonal and partially clonal organisms. TREE 20:194–201.

Guillemin et al. (2008) Genetic variation in wild and cultivated populations of the haploid-diploid red alga Gracilaria chilensis: How farming practices favor asexual reproduction and heterozygosity. Evolution 62:1500–1519.

Krueger-Hadfield et al. (2016) Invasion of novel habitats uncouples haplo-diplontic life cycles. Mol Ecol 25:3801–3816.

Reichel et al. (2016) Rare sex or out of reach equilibrium? The dynamics of FIS in partially clonal organisms. BMC Genetics 17:76.

Stoeckel and Masson (2014) The exact distributions of FIS under partial asexuality in small finite populations with mutation. PLoS One 9:e85228.

Posted in bioinformatics, evolution, genomics, natural history, next generation sequencing, theory | Leave a comment

A Primer on the Great BAMM Controversy

Update, 26 August 2016, 2:30PM. A number of readers brought my attention to a series of blog posts by Moore et al. responding to Rabosky’s rebuttal of their published critique of BAMM. I’ve included links to the posts and summarized their contents below. 

A fundamental problem in evolutionary biology is detecting patterns of variation in rates of lineage diversification and working to understand their causes. One recent statistical method to detect if and where diversification rates have changed across the branches of a phylogeny is known as BAMM, an acronym for Bayesian Analysis of Macroevolutionary Mixtures. Over the course of its short lifetime, BAMM has proven extraordinarily popular: Google Scholar shows 182 citations for Rabosky’s 2014 paper describing the method alone. BAMM works by using a Bayesian statistical framework and MCMC implementation to identify the number and location of diversification-rate shifts across the branches of a tree and the associated diversification-rate parameters (speciation, extinction, and time dependence) on each branch. In doing so, it provides a number of improvements over earlier software: it is based on an explicit model of how diversification rates shift, it features a complex and realistic model of branching, and it quantifies statistical uncertainty (rather than only providing fixed point estimates of parameters).


However, beginning with a heavily attended talk at this year’s Evolution Meetings in Austin, TX, Brian Moore and colleagues have raised a number of important questions about the performance and reliability of BAMM, beginning a dialogue with Rabosky and the other BAMM developers. What follows is a slimmed-down (although admittedly still complex!) summary of both sides’ major points, drawing on Moore et al.’s recent PNAS paper and rebuttals posted in the BAMM documentation.

Continue reading

Posted in evolution, phylogenetics, software, speciation | 4 Comments

The trouble with PCR duplicates

The sequencing center just sent your lane of Illumina data. You’re excited. Life is great. You begin to process the data. You align the data. You check for PCR duplicates. 50 percent. Half of your data is garbage. Everything is horrible. Life is horrible. How did this happen!?!

PCR duplicates are a headache… if a headache were costing you hundreds/thousands of dollars in wasted sequencing. However, they’re an inevitable part of life when using PCR during Illumina library prep. We can define a PCR duplicate as any two reads that came from the same original DNA fragment. These are a problem because they can falsely increase homozygosity: if copies of one allele are over-amplified at a heterozygous site, a genotype caller may see mostly that allele and call a homozygote. I’ve recently spent way too much time thinking about how these duplicates arise, how we might minimize them, and generally trying to understand what the heck is going on during library prep and sequencing.
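As a rough illustration of how a duplicate rate like that 50 percent might be estimated, the sketch below treats aligned read pairs with identical mapping coordinates as copies of the same original fragment. This is a commonly used proxy rather than the definition itself, and it is a simplification: the tuple layout, function name, and toy coordinates are all made up for illustration, and real tools handle further details (optical duplicates, soft clipping, and independent fragments that happen to share coordinates).

```python
from collections import Counter


def duplicate_fraction(read_pairs):
    """Estimate the fraction of read pairs that are duplicates.

    read_pairs: iterable of (chrom, strand, read1_start, read2_start) tuples.
    Pairs with identical coordinates are assumed to come from the same
    original fragment; one copy is kept and the rest count as duplicates.
    """
    counts = Counter(read_pairs)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    unique_fragments = len(counts)
    return (total - unique_fragments) / total


# Toy, made-up alignments. With the original RAD approach, read 1 starts at
# the restriction site, so the read 2 position is what separates fragments.
toy_pairs = [
    ("chr1", "+", 100, 350),
    ("chr1", "+", 100, 350),   # same coordinates as above: counted as a duplicate
    ("chr1", "+", 100, 410),   # same cut site, different sheared end: kept
    ("chr2", "-", 5000, 4720),
]

print(duplicate_fraction(toy_pairs))
```

Here one of the four pairs repeats an earlier pair's coordinates, so a quarter of the toy data would be flagged as duplicate.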

In this post I’ll be walking through some RAD data I’ve recently generated (some of these findings could apply to whole genome sequencing, though most of the issues would be far less likely). I’ll focus on the original RAD approach and will assume everyone is somewhat familiar with this method, but see Andrews et al., 2016 for an overview. Hopefully some of this will be useful to others.

Continue reading

Posted in bioinformatics, genomics, methods, next generation sequencing | 3 Comments