As one of the inventors of next-generation DNA sequencing, Sir Shankar Balasubramanian could claim to be responsible for a revolution in the life sciences.
Balasubramanian and chemist David Klenerman, both at the University of Cambridge, founded the start-up Solexa in 1998; in 2006, they released their first commercial genome analyzer, able to sequence billions of DNA bases in parallel by detecting color-coded nucleotides as they are added to growing DNA strands. Solexa was acquired by Illumina, which as of last year had an 80% share of the global DNA sequencing market. In 2012, Balasubramanian launched Cambridge Epigenetix, a company developing technologies that sense DNA methylation and other modifications of DNA’s four nucleotide bases during sequencing.
He has stayed in academia and continued working on many of the fundamental questions related to the structure of DNA molecules. One of his main areas of study is G-quadruplexes, which consist of four guanine bases, often on a single DNA strand, that come together and cause DNA to loop back on itself twice. His group has developed methods to detect the hundreds of thousands of G-quadruplexes in human cells and is now investigating their role in regulating transcription and shaping cell programming.
Rachel Brazil spoke to Balasubramanian about his early work on next-generation sequencing and his continued fundamental work on DNA structure and function. This interview was edited for length and clarity.
▸ Hometown: Born in Madras (now Chennai), South India, and grew up in Runcorn, England
▸ Education: BA, natural sciences, and PhD, chemistry, University of Cambridge
▸ Interests outside science: Running, cycling, and wine
▸ Favorite book: A Fine Balance, by Rohinton Mistry
▸ Favorite place to be when not at home: Somewhere with warm sunshine, rolling hills, and vineyards
▸ Ambition not yet fulfilled: Climb a mountain above 6,000 m or play football for Liverpool FC. “Either option would do it!”
▸ The secret to running a successful research group: “Find the right balance between providing clear guidance and allowing room for exploration, [and] celebrate the efforts, achievements, and successes of everyone.”
What were the biggest challenges you faced in developing your sequencing method, which ultimately led to Illumina’s sequencing system?
If you ask each type of scientist, they’ll focus on their particular area and point out the challenges. For me, the part I had the most to do with was the polymerase biochemistry and the nucleotide chemistry.
But I would say perhaps the greatest challenge was bringing all the components together into an integrated system: going from the science to building a commercial system that has the accuracy and the cost characteristics to really democratize sequencing. We’re chemists, we’re not geneticists; we wanted to put this in the hands of people who could do interesting things with it.
What other tools have you been developing to study genetic information?
Information is encrypted in DNA in many different ways. The sequence of the four genetic bases [commonly represented by the letters C, G, A, and T] is one aspect, but it’s not the only one.
The other area, of course, is natural epigenetic modifications of DNA, such as methylation. The old way of identifying methylation sites uses the chemical reagent bisulfite. It skips over the methylated cytosines but converts unmethylated cytosines to uracils—which are actually read as thymines. That’s not great, because from a sequencing perspective, it converts your genetic alphabet to three letters rather than four. You lose genetic information to gain the epigenetic information.
What’s needed is a sequencing method that directly reads more than four letters of the DNA alphabet. You have to come up with a clever system of converting the identity of bases in a way that depends on whether they’re modified or not. There are technologies coming out, including from my company, Cambridge Epigenetix. We have methods for sequencing five letters simultaneously, and six letters will be coming later on.
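The information loss from bisulfite treatment described above can be illustrated with a toy sketch. It assumes an idealized, error-free conversion on a single strand; the function name `bisulfite_read` and the example sequences are hypothetical and are not taken from any sequencing software or from Cambridge Epigenetix's methods.

```python
def bisulfite_read(sequence, methylated_positions):
    """Return the bases a sequencer would report after idealized bisulfite conversion.

    Assumption: every unmethylated C is converted to U and then read as T,
    while methylated C (at the given 0-based positions) is left as C.
    """
    read = []
    for position, base in enumerate(sequence):
        if base == "C" and position not in methylated_positions:
            read.append("T")  # unmethylated cytosine -> uracil -> read as thymine
        else:
            read.append(base)  # methylated cytosine and all other bases unchanged
    return "".join(read)


# Two different (hypothetical) genetic sequences with the same methylated sites.
methylated = {1, 5}
sequence_a = "ACGTCCGATC"
sequence_b = "ACGTTCGATT"

read_a = bisulfite_read(sequence_a, methylated)
read_b = bisulfite_read(sequence_b, methylated)

# Both reads are identical, so a T in the read could have been a T or an
# unmethylated C in the original DNA.
print(read_a, read_b, read_a == read_b)
```

In this sketch the two distinct sequences give the same read, which is the trade-off described in the answer: every C that survives marks a methylated site, but a T no longer says whether the original base was T or unmethylated C.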
Do you think genetic research will move beyond simple base pairs and focus on larger DNA structures, like G-quadruplexes?
I think we should keep an open mind to what some people call alternative DNA structures, and the G-quadruplex story is an example. We actually started the G-quadruplex work around the same time as Solexa sequencing was up and running, but I think it’s taken longer to see more of the picture.
Perhaps what’s different now is that we’ve created the tools to access these questions in DNA, in chromatin, in cells. A small lab can do what the whole world couldn’t do 20 years ago.
What are you discovering about the role of G-quadruplexes in controlling gene expression in cells?
Something we wanted to understand was where G-quadruplexes form in cells. The ones that we could actually detect in chromatin were heavily enriched in regulatory regions, in particular the region upstream of transcription start sites (Nat. Genet. 2016, DOI: 10.1038/ng.3662). This fit our hypothesis that these structures are associated with transcription somehow.
The most recent work we’ve done is at the single-cell level. We’ve been able to infer the identity of a cell only by profiling its G-quadruplexes, which fits my view that G-quadruplexes are a marker of cellular identity. As human stem cells differentiate into two different lineages, we are able to see the changes in quadruplexes and chromatin in key genes (Nat. Commun. 2022, DOI: 10.1038/s41467-021-27719-1).
The next chapter is really about trying to establish details of the quadruplexes’ functions and map out pathways and protein interactions. And this still leaves open the holy grail question as to exactly what controls the formation of quadruplexes. We don’t know that yet; that’s a work in progress.
You’re also interested in how DNA structure relates to cancer. Could understanding G-quadruplexes provide routes to new cancer therapeutics?
I think of cancer cells as being different states of healthy cells after a reprogramming of their functions. In 2014, we found that in liver [and stomach] cancers, the cancer tissue had a much higher density of quadruplexes than noncancerous tissue (PLOS One 2014, DOI: 10.1371/journal.pone.0102711). This suggested that there was a link between G-quadruplexes and the functional state of the cell.
Perturbing the epigenome is a very serious strategy being deployed for cancer, and it’s usually through targeting the epigenetic machinery—histone deacetylases or DNA methyltransferases. I think G-quadruplexes may provide another mechanism. Many drugs work by interacting with DNA indiscriminately, but targeting quadruplexes is not indiscriminate—you’re targeting a defined feature.
So I think there’s more work to be done here to understand the mechanism. Twenty years ago, I wasn’t sure that these structures even existed in living systems.
Rachel Brazil is a freelance writer based in London. A version of this story first appeared in ACS Central Science: cenm.ag/shankar.