2012 NGS Field Guide – Table 1 – Instrument Use Grades

Table 1a. Utility of 2nd and 3rd generation DNA sequencing platforms for de novo assemblies of different templates. The table assumes that the HiSeq 2500, Ion Torrent Proton, and Oxford Nanopore instruments achieve their stated goals (independent of stated timelines). The initial letter indicates the author’s opinion of the overall utility (grade) of a platform for a specific application. Utility grades combine data characteristics (amount, quality, length), cost of data, and ease of assembling the data into the final desired product. Major considerations for utility grades are noted.

Application: de novo assemblies
Platform – instrument | BACs, plastids, & microbial genomes | Transcriptome | Plant & animal genome
454 – GS Jr. | B – good but expensive | C – need multiple runs, expensive | D – cost prohibitive
454 – FLX+ | A – good, need to multiplex to be economical | A/B – good but expensive, not best for short RNAs | B/C – good as part of a mixed platform strategy, expensive to use alone
MiSeq | B – good, assembly more challenging than 454 | B/A – may need multiple runs, assembly more challenging than 454, longer reads may make it the best | C – expensive, use to validate libraries for HiSeq
HiSeq 2000 | B/C – more data than needed unless highly indexed; assembly more challenging than 454 | A/B – good, assembly more challenging than 454 but much more data available for analyses | A – primary data type in many current projects; requires mate-pair libraries
HiSeq 2500 – rapid run | B – more data than needed unless highly indexed; assembly more challenging than 454 | A – good, assembly more challenging than 454 but much more data available for analyses | A – will probably be more expensive than HiSeq 2000, but increased read length is worth it
Ion Torrent – 314 | C – OK, but reads are shorter than Illumina & as expensive as 454 | C – OK, but reads are shorter than Illumina & as expensive as 454 | D – cost prohibitive, reads shorter than alternatives
Ion Torrent – 318 | B – good, data more challenging to assemble than 454 or Illumina | B/C – good, data more challenging to assemble than 454 or Illumina | C – high cost, data more challenging to assemble than 454 or Illumina
Ion Torrent Proton | B – more data than needed unless highly indexed; assembly more challenging than 454 or Illumina | B/A – assembly more challenging than 454, longer reads could make it the best | B/A – cost per MB for Proton II & longer reads could make it the best
MinION | B – more expensive than GridION | B – more expensive than GridION | C – more expensive than GridION
GridION | B/A – great for scaffolding, will need to combine with short reads with lower error rate until error rates are reduced | B/A – great for defining full-length transcripts, will need to combine with short reads with lower error rate until error rates are reduced | B/A – great for scaffolding, will need to combine with short reads with lower error rate until error rates are reduced
SOLiD – 5500 | C – more data than needed unless highly indexed; assembly more challenging than 454 or Illumina | C/D – short reads make assembly challenging or impossible | C/D – short reads make assembly challenging or impossible
PacBio – RS | B – requires high coverage due to high error rates | B – expensive, short RNA will be challenging | B/D – strobed reads could be good as part of a mixed platform strategy; cost prohibitive to use alone
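
The caption states that the initial letter of each cell is the overall grade. As a rough illustration of how the cells could be read programmatically, the sketch below extracts that initial letter and sorts a few platforms by it. The function names, the best-to-worst ordering A, B, C, D, F, and the choice to ignore a second letter (as in "B/D") are assumptions made for this example, not part of the original guide.

```python
# Assumed ordering of letter grades, best to worst (an assumption of this sketch).
GRADE_ORDER = "ABCDF"


def primary_grade(cell):
    """Return the initial grade letter of a table cell, or None if it has none."""
    letter = cell.strip()[:1].upper()
    # An empty cell or a cell starting with another character has no grade.
    return letter if letter and letter in GRADE_ORDER else None


def rank_platforms(cells):
    """Sort platform names by primary grade, best first; ungraded cells are dropped."""
    graded = {name: primary_grade(text) for name, text in cells.items()}
    ranked = [name for name, grade in graded.items() if grade is not None]
    return sorted(ranked, key=lambda name: GRADE_ORDER.index(graded[name]))


# A few cells copied from the "Plant & animal genome" column of Table 1a.
plant_animal = {
    "454 – GS Jr.": "D – cost prohibitive",
    "MiSeq": "C – expensive, use to validate libraries for HiSeq",
    "HiSeq 2000": "A – primary data type in many current projects; requires mate-pair libraries",
    "PacBio – RS": "B/D – strobed reads could be good as part of a mixed platform strategy; cost prohibitive to use alone",
}

print(rank_platforms(plant_animal))
```

Only the first letter is used here, so a split grade such as "B/D" is treated as a B; a different reading of split grades would need a different key function.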

Table 1b. Utility of 2nd and 3rd generation DNA sequencing platforms for resequencing applications. The table assumes that the HiSeq 2500, Ion Torrent Proton, and Oxford Nanopore instruments achieve their stated goals (independent of stated timelines). The initial letter indicates the author’s opinion of the overall utility (grade) of a platform for a specific application. Utility grades combine data characteristics (amount, quality, length), cost of data, and ease of assembling the data into the final desired product. Major considerations for utility grades are noted.

Application: resequencing
Platform – instrument | Targeted loci | Transcript counting | Genome resequencing
454 – GS Jr. | B – good but expensive, need to limit loci | D – cost prohibitive | D – cost prohibitive for large genomes
454 – FLX+ | B – good but expensive, should limit loci | D – cost prohibitive | C/D – cost prohibitive for large genomes
MiSeq | A/B – good, fewer and higher cost reads than HiSeq | B – more expensive than HiSeq or SOLiD | C – expensive for large genomes
HiSeq 2000 | A – primary data type in many current projects; best for many loci | A – primary data type in many current projects | A – primary data type in many current projects
HiSeq 2500 – rapid run | A – faster path to leading data type | A/B – likely to be slightly more expensive than with standard flow cell | A – faster path to leading data type
Ion Torrent – 314 | C – OK but expensive, need to limit loci | D – cost prohibitive | D – cost prohibitive
Ion Torrent – 318 | B – good, slightly less data per run than MiSeq | B/C – more expensive than HiSeq or SOLiD; new informatics pipelines needed; new error profile | C – expensive for large genomes
Ion Torrent Proton | B/A – could displace HiSeq, but different error profile will inhibit switching | B – similar to 318, but much closer in cost per read | B/A – Proton II chip will set new pricing standard, could become leading shorter-read platform
MinION | C? – error profile may make this less desirable | D – probably cost prohibitive | C? – expensive for large genomes
GridION | B? – error profile may make this less desirable | B? – error profile may make this less desirable | B? – error profile may make this less desirable
SOLiD – 5500xl | A/B – good for many loci and many indexed samples | A – primary data type in many current projects | A – primary data type in many current projects
PacBio – RS | D – requires high coverage due to high error rates | D – cost prohibitive | D – cost prohibitive for large genomes

Table 1c. Utility of 2nd and 3rd generation DNA sequencing platforms for various applications. The table assumes that the HiSeq 2500, Ion Torrent Proton, and Oxford Nanopore instruments achieve their stated goals (independent of stated timelines). The initial letter indicates the author’s opinion of the overall utility (grade) of a platform for a specific application. Utility grades combine data characteristics (amount, quality, length), cost of data, and ease of assembling the data into the final desired product. Major considerations for utility grades are noted.

Various Applications
Platform – instrument | Metagenomics¹ | Mutation Detection² | Other limitations
454 – GS Jr. | A/B – good but costs limit sample number & depth | D – cost prohibitive | Customer happiness
454 – FLX+ | A – good, long reads maximize data per read | C – expensive, good for identifying clusters | Reliability issues
MiSeq | B/A – good; shorter reads than 454, but much greater depth; beware phase* | B – more expensive than HiSeq, SOLiD, or Proton | Expensive reagents; few reagent kit options
HiSeq 2000 | B – OK, limited by short reads, beware phase* | A – primary data type in many current projects | Increasing reagent costs
HiSeq 2500 – rapid run | B/A – good, limited by short reads, beware phase* | A – primary data type in many current projects | What will reagents cost?
Ion Torrent – 314 | C – limited by short reads and cost | D – cost prohibitive | Will longer reads become available?
Ion Torrent – 318 | B – shorter read-length than 454 or MiSeq; no phase issues | B – much more expensive than HiSeq or SOLiD | Will 400 base reads become available?
Ion Torrent Proton | B – shorter read-length than 454 or MiSeq, longer reads than HiSeq | B – slightly more expensive than HiSeq or SOLiD until Proton II chip becomes available | Will 400 base reads become available?
MinION | B – excellent for environmental sample sequencing; enrichment techniques will need to be developed; field portable; limited by accuracy | ? – accuracy is likely limiting for this application | When will this be available? When will data be publicly available for non-commercial software development?
GridION | B – excellent for environmental sample sequencing; enrichment techniques will need to be developed; limited by accuracy | ? – accuracy is likely limiting for this application | When will this be available? How much will the nodes and reagents cost?
SOLiD – 5500xl | D – limited by short reads | B – frequent data type in many current projects | Limited future
PacBio – RS | D – cost & insufficient accuracy except for short consensus sequence reads | F – cost prohibitive & insufficient accuracy except for short consensus sequence reads | When will new applications be available?

¹Metagenomics – characterization of 16S sequences within and among microbial communities, primarily via sequencing amplicons.

²Mutation Detection – identification of rare sequence variants.

*Illumina instruments require a mixture of different base signals among clusters during each cycle; amplicon sequencing therefore requires strategies to offset the beginning bases of amplicons, or the use of custom sequencing primers.
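
The phase note above can be illustrated with a toy calculation. The sketch below counts how many distinct bases appear across reads at each sequencing cycle, first for identical amplicon reads and then for reads whose starts are offset by short spacers. The amplicon sequence, the spacer sequences, and the use of "distinct bases per cycle" as a stand-in for signal diversity are all assumptions chosen for illustration; they are not taken from the guide or from Illumina documentation.

```python
def distinct_bases_per_cycle(reads, n_cycles):
    """Count the distinct bases seen across all reads at each of the first n_cycles positions."""
    return [len({read[i] for read in reads}) for i in range(n_cycles)]


# Made-up amplicon start; every cluster from one amplicon begins with the same bases.
AMPLICON_START = "GTGCCAGCAGCC"

# Hypothetical spacers of length 0-3 placed before the amplicon to offset its start.
SPACERS = ["", "T", "GA", "CTG"]

unstaggered = [AMPLICON_START] * len(SPACERS)
staggered = [spacer + AMPLICON_START for spacer in SPACERS]

N_CYCLES = 8
flat = distinct_bases_per_cycle(unstaggered, N_CYCLES)
mixed = distinct_bases_per_cycle(staggered, N_CYCLES)

# Identical reads show one base per cycle; offset reads show a mixture.
print("unstaggered:", flat)
print("staggered:  ", mixed)
```

With identical reads every cycle sees a single base, which is the situation the footnote warns about; offsetting the starts is one way to mix base signals within a cycle.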