Because sequencing. With affordable genome and metagenome sequencing now widely available, we can profile microbial communities more accurately than ever before. That makes efficient methods for data analysis essential. While some researchers are adept at collecting samples, preparing them for sequencing, and analyzing the mountains of resulting data, there can also be an appreciable gap between the wet lab and the bioinformatics side of a project. It’s important to develop tools that allow for powerful and efficient data analysis. Even if you don’t have the strongest background in programming, there shouldn’t be a barrier to understanding all of the cool stuff your data has to offer.
Developing tools for data analysis is no small feat; accounting for bias or other issues in the sequence data itself is just one of the many challenges. That said, some nifty tools have emerged for the powerful analysis of large sequence data sets, including Anvi’o (Eren et al., 2015) from the Meren lab, Kraken (Wood et al., 2014), and CLARK (Ounit et al., 2015). Just today, another was published in Genome Biology.
The study by Flygare and colleagues (2016) presents “Taxonomer: an interactive metagenomics analysis portal for universal pathogen detection and host mRNA expression profiling”. As the authors highlight, the analysis of very large data sets is hampered by long computation times and algorithmic inaccuracies. They present the tool as a way to help identify pathogens across broad geographic scales. In particular, they point out that we have an unprecedented opportunity to link microbial communities to human health and disease.
Advances in RNA sequencing have also made it possible to detect pathogens and to measure shifts in host gene expression in response to them. These developments have the potential to enhance disease diagnosis and treatment. Moving away from PCR amplification of marker genes and toward metagenomic sequencing also removes some of the biases that amplification introduces.
Taxonomer presents itself as a fast, easy-to-use, web-based metagenomic sequence analysis tool for DNA or RNA sequences. It claims to be the most comprehensive taxonomic profiling tool around and also very, very fast. It could also enhance pathogen detection, and it strives to make high-quality data analysis accessible to people who are not bioinformatics specialists. Up-and-coming web-based tools like this one, which aim to analyze large sets of sequence data more accurately and quickly, show that we are headed in the right direction.
Eren AM, Esen ÖC, Quince C, Vineis JH, Morrison HG, Sogin ML, Delmont TO. Anvi’o: an advanced analysis and visualization platform for ‘omics data. PeerJ. 2015 Oct 8;3:e1319.
Flygare S, Simmon K, Miller C, Qiao Y, Kennedy B, Di Sera T, Graf EH, Tardif KD, et al. Taxonomer: an interactive metagenomics analysis portal for universal pathogen detection and host mRNA expression profiling. Genome Biol. 2016 May 26;17:111. doi: 10.1186/s13059-016-0969-1.
Ounit R, Wanamaker S, Close TJ, Lonardi S. CLARK: fast and accurate classification of metagenomic and genomic sequences using discriminative k-mers. BMC Genomics. 2015;16:236.
Wood DE, Salzberg SL. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol. 2014;15:R46.