Repost



Posted by: starfire on 2009-12-03, 22:48:36:

In reply to: "Look at this" by starfire on 2009-12-03, 22:40:26:

PacBio Reveals Commercial Specs; Initial Focus on Long Reads, Short Runs, Low Experiment Cost
November 17, 2009
By Julia Karow

Pacific BioSciences this week revealed a number of performance specifications for its first commercial single-molecule real-time DNA sequencer, due to be released during the second half of 2010, as well as a roadmap for expected improvements to the system through 2014.

While several competing sequencing platforms are focusing on large numbers of reads and low cost per base, PacBio is initially emphasizing the comparatively long read lengths, short run time, low cost per experiment, and different applications that its single-molecule analysis system will provide.

Following the launch next year, the company projects that over the next two years, through reagent and software changes, the system's read length, throughput, and speed will increase, while the cost per base will fall. In addition, applications other than DNA sequencing will become available during that time. In 2013, PacBio plans to start testing an upgraded version of its system with an improved sensor capable of monitoring more than ten times the number of reads and faster reactions. That system will also support additional applications.

"The inherent economies and capabilities of third-gen [sequencing] will expand the current market," PacBio CEO Hugh Martin told In Sequence. "It's not necessarily going to be at the expense of, say, Illumina, but it's going to be in addition" to existing second-generation sequencing systems.

PacBio plans to formally announce specs for its commercial system at the Advances in Genome Biology and Technology conference in February, but provided some numbers for In Sequence this week. The first version will run arrays, or "sets," with 80,000 zero-mode waveguides each, tiny reaction chambers in which single DNA molecules are synthesized.

This represents an almost 30-fold increase over the firm's current prototype, which has 3,000 zero-mode waveguides per chip, according to Martin. At the moment, about a third of the ZMWs can be loaded with one active DNA polymerase enzyme each, which can be used to obtain DNA sequence data.

A so-called SMRT cell, filled with a batch of sequencing reagents, allows users to analyze either one or two zero-mode waveguide sets, both loaded with the same DNA library. A SMRT cell with reagents will have a list price of $99.

Running two sets instead of one per SMRT cell increases the run time, since they are analyzed sequentially, but requires no additional reagents, so the cost per base is cut in half. Future versions of the system will be able to run even more sets per cell. "Our pricing strategy, at this time, is that we are going to probably fix that $100 per-run cost," Martin said. "Over time, that price is going to remain the same, but the amount of sequence that you will get for that $100 will go up tremendously."
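The halving follows directly from spreading a fixed reagent price over more output. A minimal sketch in Python, assuming a placeholder yield per set: the article gives no throughput figure for a single set, so `ASSUMED_BASES_PER_SET` and the function name below are hypothetical and serve only to illustrate the arithmetic.

```python
SMRT_CELL_PRICE_USD = 99.0  # list price per SMRT cell with reagents, from the article


def cost_per_megabase(sets_per_cell, bases_per_set, cell_price=SMRT_CELL_PRICE_USD):
    """Reagent cost per megabase when one SMRT cell's reagents serve several sets.

    bases_per_set is a hypothetical input: the article does not state it.
    """
    total_megabases = sets_per_cell * bases_per_set / 1e6
    return cell_price / total_megabases


# Placeholder yield, chosen only to make the arithmetic easy to follow.
ASSUMED_BASES_PER_SET = 10_000_000

one_set = cost_per_megabase(1, ASSUMED_BASES_PER_SET)
two_sets = cost_per_megabase(2, ASSUMED_BASES_PER_SET)
print(f"1 set:  ${one_set:.2f} per Mb")
print(f"2 sets: ${two_sets:.2f} per Mb")
```

Whatever yield is plugged in, the ratio between the two cases stays at one half, because the price is fixed and only the denominator changes.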

The minimum run time is between 10 and 15 minutes, which users can adjust, depending on whether they want to maximize their throughput or read length.

An entire experiment, "from the time that you start sample prep to when you have your data," can be completed in less than 12 hours, Martin said, adding that the company plans to cut this time to four to five hours with optimizations. Sample prep costs "will be comparable or lower than with other technologies," he said, and the company has an ongoing collaboration to automate its sample prep protocols.

Users will also be able to load the instrument with batches of up to 96 SMRT cells, which the instrument can run unattended. Each of these cells can run a different protocol, such as standard sequencing; circular consensus sequencing (see In Sequence 10/14/2008), where a circular substrate is sequenced several times over; or strobe sequencing, where each read is broken up into pieces with "dark" intervals (see In Sequence 5/12/2009).

The order of cells is determined by a built-in scheduler that optimizes the throughput. Sample prep can be "easily" multiplexed for each protocol, according to Martin.

The average read length will initially be "at or beyond" that of the 700 to 1,000 or so bases of Sanger sequencing, "and will continue to increase rapidly over time," according to Martin.

Initially, the system's DNA polymerase will add nucleotides at a speed of between one and three bases per second, which the instrument will record in real time. "This is probably the single biggest discriminator between second-gen and third-gen sequencing — the base-to-base speed," Martin said, noting that existing sequencing platforms require lengthy wash cycles between each base incorporation.
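As a rough illustration of what these rates imply, the number of bases one continuously active polymerase could add during a run is simply rate times duration. The sketch below combines the one to three bases per second quoted here with the 10 to 15 minute run time mentioned earlier. It assumes an enzyme that never pauses or stops, which is an idealization made for this example and not a PacBio specification; the function name is likewise illustrative.

```python
def max_bases_per_polymerase(bases_per_second, run_minutes):
    """Upper bound on bases added by one polymerase that stays active all run."""
    return bases_per_second * run_minutes * 60


low = max_bases_per_polymerase(1, 10)   # slowest rate, shortest run
high = max_bases_per_polymerase(3, 15)  # fastest rate, longest run
print(low, high)
```

Under these idealized assumptions the bounds come out at 600 and 2,700 bases per polymerase, the same order of magnitude as the Sanger-like read lengths Martin describes.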

The system's software pipeline has two components. The primary analysis, which is performed in real time on a compute module that comes with the instrument, provides base calls with quality values and a vector that represents temporal and contextual information around the base call. "We have designed significant headroom in that compute infrastructure so that over time, we can dramatically increase the throughput of the machine and continually meet the objective of data in real time," Martin said.

The company will encourage third-party developers to work with the data and develop tools that improve the initial output. To that end, it will organize a developer conference at next year's AGBT meeting, where it will share information about its software and "make sure the developers understand the various opportunities and how they can work with us to build innovation on top of the platform," Martin said.

"I think there are a lot of opportunities for third parties to come in and build tools that will change different characteristics of our products," for example to increase the accuracy of the data, "and we want to encourage that," he added.

The secondary analysis — assembling or mapping the data — is performed off the instrument. PacBio will not sell the required computing equipment, all "standard off-the-shelf blade servers and disk arrays," according to Martin, but will provide recommendations and requirements.

Martin declined for now to estimate the throughput of the first commercial instrument, saying that it is "dependent on a number of variables which are not finalized."

Assuming that a third of the zero-mode waveguides are loaded with one enzyme that remains active over a 15-minute run, the output per SMRT cell run could theoretically reach up to 72 megabases if two sets run sequentially, provided the polymerase proceeds at 3 bases per second. This would translate to a throughput of up to 140 megabases per hour.

By comparison, Illumina's Genome Analyzer provides up to 33 gigabases of high-quality data in a 9.5-day run, according to specifications on the company's website, translating to approximately 140 megabases per hour as well. The company recently said that several customers have achieved more than 55 gigabases per run and that it is targeting a throughput of 95 gigabases per run by around the end of the year (see In Sequence 11/3/2009).
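Both hourly figures can be reproduced with a line of arithmetic each. The sketch below takes the article's numbers as given (72 megabases per SMRT cell run, 33 gigabases per 9.5-day Genome Analyzer run) and assumes, as one reading of the text, that a two-set run lasts 2 x 15 minutes. The function name is illustrative.

```python
def megabases_per_hour(total_megabases, run_hours):
    """Average throughput over a run, in megabases per hour."""
    return total_megabases / run_hours


# PacBio: 72 Mb per SMRT cell run; two sequential 15-minute sets
# (assumed here to total 0.5 hours).
pacbio = megabases_per_hour(72, 2 * 15 / 60)

# Illumina Genome Analyzer: 33 Gb over a 9.5-day run.
illumina = megabases_per_hour(33 * 1000, 9.5 * 24)

print(f"PacBio (theoretical): {pacbio:.0f} Mb/h")
print(f"Illumina GA:          {illumina:.0f} Mb/h")
```

Both values land at roughly 144 to 145 megabases per hour, in line with the rounded figure of 140 used in the text. The PacBio number is a theoretical ceiling under the stated assumptions, while the Illumina number is derived from a published specification.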




