
Analytical Chemistry

The $1,000 Genome

New methods aim to drive cost of sequencing an individual human genome to below $1,000

by Celia Henry Arnaud
December 19, 2005 | APPEARED IN VOLUME 83, ISSUE 51

Flashing Lights
Credit: Courtesy of VisiGen Biotechnologies
VisiGen's sequencing technology incorporates a donor fluorophore on the DNA polymerase and a different acceptor fluorophore on each nucleotide.

The Human Genome Project is long done, and the entire genome is sequenced. We know the order of the adenines, guanines, cytosines, and thymines. End of story, right? Wrong. It's only the beginning. The challenge now is to find a way to integrate genomic information into health care.

Sequencing is still prohibitively expensive. The Human Genome Project spent billions of dollars to sequence a single genome. The price has fallen significantly since then to about $10 million for a genome the size of a human's. But even that fraction of the original amount is way too expensive to make genome sequencing practical for individual medical decisions.

But what if it cost only $1,000 to sequence an individual human genome? Suddenly, sequencing every person's genome would be within reach.

That's just what could happen if the projects funded by the Revolutionary Genome Sequencing Technologies program succeed. Earlier this year, the National Human Genome Research Institute, part of the National Institutes of Health, awarded nine grants totaling more than $25 million, each with the goal of reducing the cost of sequencing a genome to $1,000 or less.

One grant recipient, Hagan Bayley of the University of Oxford, credits NIH with taking the risk to fund speculative projects. "For a relatively modest amount of money, it brings together several excellent groups to carry out research that is quite a bit riskier than usual," he says. "In this particular case, the payoff in medicine will be huge."

The sequencing technology from Houston-based VisiGen Biotechnologies is probably the closest to bearing fruit. Susan Hardin, company chief executive officer, believes that a number of features are needed to drive down the cost of DNA sequencing, including a single-molecule approach, massively parallel arrays, and real-time detection. VisiGen is working to incorporate all three in its technique, which uses the enzyme DNA polymerase and the nucleotides themselves to identify the bases directly as a complementary DNA strand is synthesized.

VisiGen exploits DNA replication by putting a donor fluorophore on the enzyme and a different color acceptor fluorophore on each of the four types of nucleotides. When a nucleotide is incorporated into the growing DNA strand, the attached acceptor fluorophore gives off light, the color of which identifies the base.
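The color-to-base readout described above can be sketched in a few lines of Python. This is only an illustration of the idea: the color assignments and the function names are hypothetical, since the article does not say which color VisiGen attaches to which nucleotide.

```python
# Hypothetical color assignment; the article does not specify which
# acceptor color marks which nucleotide.
COLOR_TO_BASE = {"blue": "A", "green": "C", "yellow": "G", "red": "T"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def read_synthesized_strand(flashes):
    """Translate an ordered list of observed flash colors into bases."""
    return "".join(COLOR_TO_BASE[color] for color in flashes)


def infer_template(synthesized):
    """The template strand is the reverse complement of the new strand."""
    return "".join(COMPLEMENT[base] for base in reversed(synthesized))


# Made-up sequence of flashes, in the order the polymerase adds nucleotides.
flashes = ["blue", "red", "green", "green", "yellow"]
synthesized = read_synthesized_strand(flashes)
template = infer_template(synthesized)
print(synthesized, template)
```

Because the polymerase builds the complementary strand, the sequence of the original DNA is recovered by taking the reverse complement of what the flashes spell out.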

The company is on track to launch a commercial sequencing service based on the technology by the end of 2007, according to Hardin. So far, VisiGen's researchers have been able to sequence DNA 28 bases at a time. Assembling the data into longer sequences will become more straightforward as they are able to work with longer DNA strands.

Another project is focusing on microfluidic handling of the sequencing assay. A team at Duke University, Stanford University, and Advanced Liquid Logic, headed by Duke electrical engineering professor Richard B. Fair, uses voltage control to manipulate water droplets on a hydrophobic surface. The water droplets serve as individually addressable reaction vessels for pyrosequencing, a method whereby incorporation of a nucleotide into the growing complementary DNA chain results in the release of pyrophosphate, which goes through an enzyme cascade and generates visible light.

To do the sequencing, the team immobilizes the DNA on a substrate and then uses the droplets to present nucleotides to the DNA. By collecting the pyrophosphate in the droplets and doing the detection elsewhere, the researchers speed up the process by continuing the nucleotide incorporation while the pyrophosphate reactions run to completion somewhere else.

Right now, pyrosequencing is limited to lengths of about 100 bases. Fair and his colleagues are working to push that limit to 350 and more. Once they demonstrate that all the reagents and reaction products are compatible with their voltage-driven method of manipulating the droplets, they plan to do proof-of-principle experiments and then send the technology to the genome-sequencing center at Stanford University to sequence larger pieces of DNA.

Several of the projects focus on more speculative technologies such as nanopores. Nanopores reduce cost by speeding up the process and by eliminating the need for DNA amplification and for expensive reagents such as fluorescent nucleotides. One of the challenges with nanopores is how to differentiate the bases. Each project is putting its own twist on meeting that challenge.

Credit: Courtesy of Aleksei Aksimentiev, Eduardo Cruz-Chu, Gregory Timp, and Klaus Schulten

In one version, biophysicist Aleksei Aksimentiev, electrical engineer Gregory Timp, and coworkers at the University of Illinois, Urbana-Champaign, are drilling synthetic inorganic nanopores, sized to fit single nucleotides, through multilayered silicon structures in which two semiconducting plates are separated by a dielectric layer. DNA passing through the pore induces an electrical signal at the semiconducting plates. The plan is to distinguish the bases by their dipole moments.

The problem is that the DNA passes through the pore very quickly, Aksimentiev says. "We somehow have to trap the molecule in the pore," he says.

Jingyue Ju of Columbia University is also using synthetic inorganic nanopores. His detection method relies on the fact that the nucleotides block the current flowing through the nanopore. He hopes to make the amplitude and duration of that blockage different for each nucleotide.

To accentuate the differences among the nucleotides, Ju will chemically modify them. Once the DNA bases are distinguishable, he still faces the challenge of slowing down the movement of the DNA through the nanopore for detection.

Despite the sample prep involved with Ju's method, he thinks it will significantly hasten DNA sequencing. "The nanopores provide a simple method to detect DNA at the single-molecule level without introducing any separation steps," he says. Ju believes that it will be possible to detect DNA with single-base-pair resolution with nanopores within five years. Distinguishing repeating stretches of DNA will take longer.

Not all of the nanopore projects are based on synthetic nanopores. Oxford's Bayley and Reza Ghadiri at Scripps Research Institute are developing protein nanopores for DNA sequencing based on the protein hemolysin. The protein is embedded in a membrane, and the current that flows through the pore changes as a base passes through.

Just as with synthetic nanopores, hemolysin-based nanopores must be tweaked to slow down the DNA. "We want to reach some compromise between it just speeding through like a blur, like a car shooting by that you can't identify, and it just sitting there," Bayley says.

They want to place a constriction or recognition site in the pore to help slow down the DNA. "That recognition [site] might be as simple as a physical constriction on the pore," Bayley says, "or it might be as sophisticated as attaching a few bases to the pore so that the DNA sticks and hops along as it moves through." They also are thinking about using enzymes to thread the DNA through at a controlled rate or increasing the viscosity inside the pore.

Bayley imagines using thousands of pores in parallel, probably sequencing at the rate of a base per millisecond. The impact will be even greater if they can sequence lengths of DNA of thousands or even hundreds of thousands of bases.
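A back-of-the-envelope calculation shows why parallel pores matter. The sketch below assumes a genome of roughly 3 billion bases, a single error-free pass over every base, and pores that never idle; only the rate of one base per millisecond comes from Bayley's estimate, and the pore counts are illustrative.

```python
GENOME_BASES = 3_000_000_000       # approximate human genome size (rounded)
BASES_PER_SECOND_PER_PORE = 1_000  # one base per millisecond, per Bayley


def hours_to_sequence(pores, coverage=1):
    """Hours needed to read the genome `coverage` times with `pores` pores.

    Assumes every pore runs continuously at the full rate (an idealization).
    """
    total_bases = GENOME_BASES * coverage
    seconds = total_bases / (pores * BASES_PER_SECOND_PER_PORE)
    return seconds / 3600


for pores in (1, 1_000, 10_000):
    print(f"{pores:>6} pores: {hours_to_sequence(pores):.2f} hours")
```

Under these assumptions a single pore would need more than a month of continuous operation, whereas a thousand pores would finish one pass in under an hour. Real instruments would need multiple passes and would lose time to loading and errors, so these figures are a lower bound rather than a prediction.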

Credit: Courtesy of Yann Astier

While other researchers are pushing innovation, Bhubaneswar (Bud) Mishra is trying to avoid novelty as much as possible. The New York University computer scientist's goal is "to keep it as simple as possible." He approaches the challenge as an engineering project rather than novel science. "I don't want to use any phenomenon that's not well-understood or well-characterized," he says. "The idea is to build on things that are true and tried."

Mishra is combining existing technologies and using calculations to see just how accurate they have to be to drive the cost down. After considering a number of techniques, Mishra's team decided to use optical mapping and sequencing by hybridization because they seemed likely to be the most cost-effective. In optical mapping, individual DNA molecules are stretched on a surface, cut with restriction enzymes, labeled with fluorophores, and imaged. In sequencing by hybridization, probes with known sequences that are six to eight nucleotides long hybridize to the DNA being sequenced, thereby revealing the sequences that are in the DNA. Optical mapping places markers along the genome, thus breaking the problem into manageable chunks that can then be sequenced by hybridization.
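The principle behind sequencing by hybridization can be illustrated with a toy example. The sketch below is not Mishra's algorithm: it assumes an ideal, error-free probe array, a known starting probe, and a made-up 14-base target, and it simply extends the sequence whenever exactly one probe overlaps the current end.

```python
def spectrum(dna, k):
    """Set of all length-k substrings: what an ideal probe array reports."""
    return {dna[i:i + k] for i in range(len(dna) - k + 1)}


def reconstruct(probes, start, length):
    """Grow a sequence from `start` while exactly one probe extends it.

    Assumes probes of length k >= 2. Stops early when the next base is
    ambiguous (a repeat) or when no probe matches.
    """
    k = len(start)
    seq = start
    while len(seq) < length:
        suffix = seq[-(k - 1):]
        candidates = [suffix + b for b in "ACGT" if suffix + b in probes]
        if len(candidates) != 1:
            break
        seq += candidates[0][-1]
    return seq


target = "ATGCGTACGTTAGC"  # made-up chunk, not real data

# Six-base probes (the short end of the range quoted above) suffice here.
long_probes = reconstruct(spectrum(target, 6), target[:6], len(target))

# Three-base probes hit a repeat and cannot get past it.
short_probes = reconstruct(spectrum(target, 3), target[:3], len(target))

print(long_probes, short_probes)
```

With six-base probes the toy target is recovered in full, while three-base probes stall at the first repeated stretch. That is one way to see why confining the problem to short chunks helps: a shorter chunk is less likely to contain repeats that short probes cannot resolve.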

Experimentation is combined with statistics to solve what is essentially an optimization problem: What combination of parameters will yield the fewest experiments, and therefore the lowest cost, to get an accurate sequence?

The more replicates one can run per genome, the greater the certainty that the sequence determined is accurate. If the required number of replicates is too large (say, 1 billion), then the technique is not useful, Mishra says.
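The link between replicates and certainty can be made concrete with a simplified model that is not taken from Mishra's work. Assume each replicate misreads a given position independently with the same probability, and that the consensus is wrong only when a majority of replicates are wrong; the 10% error rate below is purely illustrative.

```python
from math import comb


def consensus_error(per_read_error, replicates):
    """Probability that a majority of independent reads at a position err.

    Simplified binomial model: every erroneous read counts against the
    consensus. Use an odd number of replicates to avoid ties.
    """
    need = replicates // 2 + 1  # smallest number of wrong reads that wins
    return sum(
        comb(replicates, k)
        * per_read_error ** k
        * (1 - per_read_error) ** (replicates - k)
        for k in range(need, replicates + 1)
    )


for n in (1, 3, 5, 11):
    print(f"{n:>2} replicates: {consensus_error(0.10, n):.2e}")
```

In this model the chance of a wrong call falls steeply as replicates are added, but each replicate also adds cost, which is the trade-off the optimization has to settle.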

What Mishra's team discovered in combining the techniques is that they obey what Mishra calls a "computational phase transition." If the parameters are below a certain level, the probability of getting an accurate sequence is zero, but if they are above that level, the probability jumps to one. "Whether anyone can make a correct map or not very sensitively depends on those parameter values," he says. "It's all or nothing."

Only time will tell whether any or all of these methods will deliver the $1,000 genome. The indications are that they should work. As Bayley says, "There's nothing fundamentally against the laws of physics here."


