I’m a late adopter of DNA barcoding. As a botanist, I have often felt that DNA barcoding wasn’t really for me. Unlike in animals, where the mitochondrial gene CO1 often tracks species boundaries, in plants there is rarely an exact match between DNA barcode sequence and species identity. A more general issue is that one or a few regions of non-recombining organellar DNA just don’t cut it for answering the population genetic questions I’m most interested in.
But it’s now becoming clear that the scalability of DNA barcoding, which allows it to be used on hundreds or thousands of specimens at a reasonable cost, may make it a primary tool for accelerating species discovery and describing biodiversity patterns in the face of massive species extinctions. Perhaps equally important to me is that the plant DNA barcode isn’t set in stone, and new sequencing technologies will allow us to find better options for using DNA to tell plant species apart (Hollingsworth et al., 2016).
Given my new-found enthusiasm for DNA barcoding, last month I went to the 8th International Barcoding of Life Conference in Trondheim, Norway, to find out about new developments in the field. Here’s what I learnt:
DNA barcoding has found its place in the genomics era. What’s the point of sequencing a few genes when we can now sequence whole genomes? That was the question on my mind when I arrived, and I was pleased to see many good answers at the meeting. The most convincing one is that DNA barcoding is well suited to discovering species in some of the most neglected animal groups. Dan Janzen gave a superb example of how DNA barcoding is being used to discover Costa Rica’s insect diversity on a massive scale, while many speakers highlighted the use of DNA barcoding for unearthing new species in the marine realm. In these settings, complete genome sequencing would be overkill, too expensive, and often poorly suited to very small samples. The scalability of DNA barcoding, and its applicability to species discovery and to documenting and monitoring biodiversity, are part of the motivation behind BIOSCAN, a major new initiative launched at the meeting. BIOSCAN’s three research themes will employ DNA barcodes to speed species discovery, to probe species interactions, and to track species dynamics. At a cost of $180 million and involving hundreds of research scientists, the project will not only build a more comprehensive reference library of DNA barcode sequences, but also tackle major research questions about complex and cryptic species interactions, and about the spatial scale at which biodiversity is partitioned (including in often overlooked aquatic and soil systems). I’m excited to see what they find.
There’s exciting new technology. I love a new gadget or an exciting piece of technology. Paul Hebert showcased the remarkable LabCyte Echo 525, which is every lab scientist’s dream: a liquid handling system that uses acoustic energy rather than pipette tips to dispense reagents, eliminating both pipetting and the plastic waste that comes with it. The motivation for using it was to put a stop to the mountain of plastic waste produced in highly multiplexed DNA barcoding. This is good for the environment and for reducing costs, particularly now that plasticware is a bigger cost than sequencing for multiplexed DNA barcoding on the PacBio Sequel 2. My other favourite bit of kit on show at the main meeting was the Bento Lab, a beautiful piece of equipment combining a centrifuge, PCR machine, and gel visualisation in one portable box. This goes a long way towards portable genomics, especially if used in conjunction with the various portable sequencers produced by Oxford Nanopore Technologies, such as the forthcoming smartphone sequencer, the SmidgION. However, for my purposes, I’d still struggle to get good-quality DNA extractions from plant samples using current protocols and the Bento Lab, and I’m waiting for someone to come up with an easy field protocol for high molecular weight DNA extraction from plant samples.
DNA barcoding has gone (mega)genomic. It comes as no surprise that every aspect of DNA barcoding has gone genomic. What did surprise me is that it has done so in a range of smart ways that make inferences about biodiversity patterns more reliable. For example, Pierre Taberlet showed that the plummeting costs of sequencing allow metabarcoding studies to have high replication and multiple positive and negative controls (Zinger et al., 2019). Inger Greve Alsos showed how genome skimming can be used to generate complete plastid genomes for thousands of plant samples from Scandinavia, giving greatly improved taxonomic resolution over the current plant DNA barcodes. Linda Neaves showed how high-throughput sequencing of panda faecal samples can be used to detect rare components of their diet. Overall, there were numerous good examples where masses of genomic data have helped the study of biodiversity in interesting ways.
Quantifying species diversity in mixtures remains difficult. There is real interest in quantifying the abundance of different organisms in mixed samples. Phylogeographers would like to know the abundance of different pollen types in ancient sediments, clinicians need to know the exact composition of natural medicinal compounds, and ecologists would like to trace the diet composition of herbivores over space and time. But quantifying DNA in mixed samples is fraught with difficulties. Different species and tissue types often persist differently in a given environment (e.g. DNA from certain resilient plant material may remain more intact than that of other species in a faecal sample), while the representation of different species in a sequencing library will be affected by differential template amplification. I had hoped that someone might have found a solution to some of these problems, but my impression is that people are presenting relative read counts and using these as a proxy for relative abundance. There is good work going on in this area, and I was interested to see research where people give herbivores a known diet, then estimate diet composition from sequence data generated from the faecal samples, to calibrate quantification from DNA barcode data. But in general it seems that reliable quantification remains a major challenge and there’s lots still to do.
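To make the calibration idea concrete, here is a minimal sketch in Python of how a feeding trial with a known diet might be used to adjust read counts. It is my own illustration rather than a method presented at the meeting: the function names, the taxa, and the numbers are all hypothetical, and it assumes that each taxon's bias (from degradation and amplification combined) can be summarised as a single multiplicative factor, which is a strong simplification.

```python
def relative_abundance(counts):
    """Convert raw counts per taxon into proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to normalise")
    return {taxon: n / total for taxon, n in counts.items()}


def correction_factors(known_diet, trial_reads):
    """Estimate a per-taxon bias factor from a feeding trial.

    known_diet: true proportion of each taxon fed to the animal (hypothetical).
    trial_reads: read counts recovered from the faecal samples of that trial.
    A factor above 1 means the taxon is over-represented in the reads.
    """
    observed = relative_abundance(trial_reads)
    factors = {}
    for taxon, true_share in known_diet.items():
        seen = observed.get(taxon, 0.0)
        if true_share <= 0 or seen <= 0:
            # No usable signal for this taxon, so no factor can be estimated.
            raise ValueError(f"cannot calibrate taxon {taxon!r}")
        factors[taxon] = seen / true_share
    return factors


def corrected_abundance(sample_reads, factors):
    """Divide read counts by their bias factor, then renormalise.

    Taxa without a calibrated factor are left uncorrected (factor 1.0),
    which is itself an assumption worth flagging in any real analysis.
    """
    adjusted = {t: n / factors.get(t, 1.0) for t, n in sample_reads.items()}
    return relative_abundance(adjusted)


# Hypothetical feeding trial: equal parts grass and clover were fed,
# but grass dominates the reads.
factors = correction_factors(
    known_diet={"grass": 0.5, "clover": 0.5},
    trial_reads={"grass": 800, "clover": 200},
)

# Apply the factors to a field sample with an unknown diet.
estimate = corrected_abundance({"grass": 400, "clover": 100}, factors)
print(estimate)
```

Even in this toy form, the sketch shows why uncorrected read proportions can mislead: the field sample looks like 80% grass by read count, yet it is consistent with an even diet once the trial-derived bias is taken into account. Whether such factors transfer between animals, seasons, or laboratories is exactly the open question.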
There’s a gap between DNA barcoding and ecological and evolutionary research. The only disappointment at the meeting was that, in general, the big data being generated isn’t being placed in the broad conceptual framework of ecological and evolutionary research. For example, throughout the meeting there were many cases of researchers generating large DNA barcode datasets and then comparing diversity between geographic sites. There are real opportunities to do this in an ecological or evolutionary context, building on classic theory, and using well-developed statistical approaches. But unfortunately I didn’t see much of that. Instead, most scientists presented descriptive findings of species counts and new taxa. My hope is that as the datasets (and replication in metabarcoding) grow there’ll be more connected thinking and interaction with ecological and evolutionary researchers.
Hollingsworth, P. M., Li, D.-Z., van der Bank, M., & Twyford, A. D. (2016). Telling plant species apart with DNA: from barcodes to genomes. Phil. Trans. R. Soc. B, 371(1702), 20150338.
Zinger, L., Bonin, A., Alsos, I. G., Bálint, M., Bik, H., Boyer, F., Deagle, B. E… Taberlet, P (2019). DNA metabarcoding—Need for robust experimental designs to draw sound ecological conclusions. Molecular Ecology, 28(8), 1857-1862.