Tag Archives: Hadoop

Riding the Elephant

Posted on 7 Feb, 2011 by Nicholas Crawford

I recently received my first batch of reads from a single paired-end lane run on an [Illumina HiSeq](http://www.illumina.com/systems/hiseq_2000.ilmn) instrument. This batch totaled about 20 billion base pairs of DNA sequence, and the associated data files came to a combined 55.4 gigs of text. … Continue reading →

Posted in bioinformatics, next generation sequencing, software | Tagged Cluster Computing, Hadoop, MapReduce, NGS, Python | 9 Comments
