One of the coolest things about cheap sequencing, in my opinion, is that it has allowed researchers to study individual eukaryotic organisms together with their associated microbes (their microbiome). Let’s be real: we are in the midst of identifying essential interactions between eukaryotes and their microbes, which are key in driving evolution. If you have any doubt about that, feel free to check out this great read, or take a glance at this article.
Contamination introduced by researchers during sampling, DNA extraction, or library prep is a problem. However, microbiologists will likely agree that host-associated microbes are important (as I’ve covered here, here… here, as well as in bees) and aren’t just contamination when a slightly more charismatic organism is the target for whole genome sequencing. Instead, understanding which microbes are associated with the host in question could shed equally important light on the forces driving genome evolution and speciation.
A recent article by Gerth and Hurst highlighted the utility of studying non-target organism sequences obtained from next generation sequencing (NGS) experiments. The authors analyzed 993 short read libraries from honey bee (Apis sp.) sequencing projects and found that at least 11% of them harbored sequences from microbes previously identified as associates of Apis sp. They also analyzed 492 RNA sequencing libraries and determined that approximately half had evidence of viral infections. Gerth and Hurst then used the data to reconstruct draft genomes of bacteria associated with Apis sp., demonstrating that a closer look at short read sequences flagged as contamination can provide information on key associations among different organisms.
“We conclude that ‘contamination’ in short read sequencing libraries can provide useful genomic information on microbial taxa known to be associated with the target organisms, and may even lead to the discovery of novel associations.”
Although this has been demonstrated before, the study provides elegant examples showing that the abundance of available raw sequencing data can be used to uncover and characterize interactions between microbes and their hosts. Gerth and Hurst fished around in Apis sequence data using sequences from previously identified Apis-associated microbes as bait. They found that many libraries harbored Lactobacillus sequences, which clustered in groups previously associated with Apis sp.
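To make the "fishing" idea a bit more concrete, here is a minimal Python sketch of screening reads against known-associate sequences by counting shared k-mers. To be clear, this is not the authors' pipeline: the function names, the toy sequences, and the `min_shared` threshold are all made up for illustration, and a real screen would use a proper read aligner or similarity search and would also handle reverse complements.

```python
from typing import Dict, Iterable, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-length substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_bait_index(references: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Map each reference ("bait") name to the set of k-mers it contains."""
    return {name: kmers(seq, k) for name, seq in references.items()}


def screen_reads(
    reads: Iterable[str],
    index: Dict[str, Set[str]],
    k: int,
    min_shared: int = 2,  # hypothetical threshold, chosen only for this toy example
) -> Dict[str, int]:
    """Count, per reference, the reads sharing at least min_shared k-mers with it."""
    hits = {name: 0 for name in index}
    for read in reads:
        read_kmers = kmers(read, k)
        for name, ref_kmers in index.items():
            if len(read_kmers & ref_kmers) >= min_shared:
                hits[name] += 1
    return hits


if __name__ == "__main__":
    # Toy sequences, invented for illustration (not real marker genes).
    toy_references = {
        "toy_associate_A": "ATGGCTAGCTAGGATCCGATTACG",
        "toy_associate_B": "TTGACCGGTAACCGTTAGGCAATC",
    }
    toy_reads = [
        "GCTAGCTAGGATCC",  # substring of toy_associate_A
        "CCGGTAACCGTTAG",  # substring of toy_associate_B
        "AAAAAAAAAAAAAA",  # matches neither reference
    ]
    index = build_bait_index(toy_references, k=8)
    print(screen_reads(toy_reads, index, k=8))
```

The point of the sketch is only that reads usually thrown away as "contamination" can be sorted by which known associate they resemble; anything that matches no bait would need a different approach, such as de novo assembly.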
“The biological properties of an individual are a composite of the functions encoded in their genome and that of microbial associates, both symbionts and pathogens.”
One interesting tidbit was that some of the sequences did not fall within previously recognized lineages of honey bee-associated Lactobacillus sp. Instead, a new lineage related to other Apis sp. was identified. Indeed, from just one sequencing library they constructed draft genomes of two previously unsequenced microbes: Lactobacillus kunkeei and a Fructobacillus sp. The authors are careful to point out, however, that the distinction between host-associated microbes and accidentally introduced contaminants should (of course) be considered carefully.
The fun doesn’t stop there, folks. This study also looked at viruses found in RNA sequencing libraries. The authors were able to use entire viral genomes to search about 500 RNA libraries. In 15% of the libraries, between 5% and 54% of the reads were from viruses. The authors identified a single differentially expressed gene that had not been functionally characterized in honey bees, suggesting that it has a role in the bee’s response to viral infections. Indeed, according to the authors, gene expression has been shown to influence Apis sp. beeehavior (sorry…not sorry).
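For a sense of the arithmetic behind a statement like "between 5% and 54% of the reads were from viruses", here is a small Python sketch. The library names and read counts are invented for illustration and are not from the paper; the function names are hypothetical too.

```python
from typing import Dict, Tuple


def viral_fraction(viral_reads: int, total_reads: int) -> float:
    """Fraction of a library's reads that were assigned to viruses."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return viral_reads / total_reads


def share_of_libraries_above(
    counts: Dict[str, Tuple[int, int]], threshold: float
) -> float:
    """Share of libraries whose viral read fraction is at least threshold.

    counts maps a library name to (viral_reads, total_reads).
    """
    if not counts:
        raise ValueError("counts must contain at least one library")
    fractions = [viral_fraction(v, t) for v, t in counts.values()]
    return sum(f >= threshold for f in fractions) / len(fractions)


if __name__ == "__main__":
    # Hypothetical counts: (viral_reads, total_reads) per library.
    toy_counts = {
        "lib1": (60_000, 1_000_000),   # 6% viral
        "lib2": (1_000, 1_000_000),    # 0.1% viral
        "lib3": (540_000, 1_000_000),  # 54% viral
        "lib4": (0, 1_000_000),        # no viral reads
    }
    print(share_of_libraries_above(toy_counts, threshold=0.05))
```

In this toy set, two of the four libraries clear the 5% mark, so the share is one half. The real numbers depend entirely on how reads are assigned to viruses in the first place.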
“While genomes gained from contaminated bee samples cannot and should not replace focused microbiological and metagenomic investigations, they might still improve our understanding of honey bee microbiome composition and functioning.”
To understand genome evolution in any organism, we need to identify the key players, including the associated microbes. This study points out that the ever-growing pool of available sequence data offers low-hanging fruit for these sorts of studies. Essentially, there’s potential for endless interesting studies focused on the so-called ‘contaminant’ sequence fraction. The authors call for the publication of all genomes identified during any NGS experiment, not just the complete genome of the organism of interest.
Every day there are more resources for investigating and gaining better insight into the microbiomes associated with different model organisms. There are plenty of things to work out before these complex systems are understood; that’s where the interesting challenges are. We’ve come a long way since someone could tell a molecular ecologist they knew nothing. It’s not easy analyzing sequences all the time. If it were easy, everyone would do it.
Gerth, M. and Hurst, G.D., 2017. Short reads from honey bee (Apis sp.) sequencing projects reflect microbial associate diversity. PeerJ, 5, p.e3529.