Issue Date: July 25, 2011
A search for scholarly articles with the word “biomarkers” can yield more than half-a-million hits. Such a search may not say much about the content of those papers, but it does suggest that scientists are putting a lot of effort into discovering molecules that can be used to diagnose diseases, preferably in easily accessible fluids such as blood or urine. Few candidate biomarkers progress beyond that initial discovery, however. And even those that proceed beyond that initial phase often disappoint.
A prime example came earlier this year, when the Early Detection Research Network (EDRN), a program run by the National Cancer Institute, released its evaluation of the performance of 35 previously reported markers of ovarian cancer (Cancer Prev. Res., DOI: 10.1158/1940-6207.CAPR-10-0195). Many of these markers had been reported to outperform CA125 (cancer antigen 125), one of only two markers approved for monitoring ovarian cancer. When the researchers measured the markers in 180 disease and 660 control specimens obtained as part of NCI’s Prostate, Lung, Colorectal & Ovarian Cancer Screening Trial, none of the new markers turned out to be better than CA125.
“It was very disappointing,” says Eleftherios P. Diamandis, director of the Advanced Center for the Detection of Cancer at Mount Sinai Hospital, in Toronto, who participated in conducting the trial. “We went back to square one.”
The starting gun that sent the biomarker field galloping was a study published nearly a decade ago. In 2002, a team led by Emanuel F. Petricoin III, then with the Food & Drug Administration, and Lance A. Liotta, then at NCI (both now at George Mason University), reported a panel of protein markers in serum that they claimed could diagnose early-stage ovarian cancer with 100% sensitivity and 95% specificity (Lancet, DOI: 10.1016/S0140-6736(02)07746-2). Sensitivity measures the percentage of patients with a disease who actually test positive for it; specificity measures the percentage of people without disease who actually test negative. In The Lancet paper, for example, the test had a 5% false-positive rate. The paper did not identify particular proteins but instead used pattern recognition of mass spectral peaks to differentiate between patients with and without ovarian cancer.
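As a rough illustration of the two definitions above, the following Python sketch computes sensitivity, specificity, and the false-positive rate from the four possible test outcomes. The class and function names and the cohort of 50 patients and 100 controls are hypothetical, chosen only so that the arithmetic reproduces the 100% sensitivity and 95% specificity figures; they are not data from The Lancet paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing test results against true disease status."""
    true_positives: int   # people with disease who test positive
    false_negatives: int  # people with disease who test negative
    true_negatives: int   # people without disease who test negative
    false_positives: int  # people without disease who test positive


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of people with disease who test positive."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of people without disease who test negative."""
    return c.true_negatives / (c.true_negatives + c.false_positives)


def false_positive_rate(c: ConfusionCounts) -> float:
    """Fraction of people without disease who test positive (1 - specificity)."""
    return c.false_positives / (c.true_negatives + c.false_positives)


# Hypothetical cohort (an assumption for illustration only):
# 50 patients, all detected; 100 controls, 5 of them wrongly flagged.
example = ConfusionCounts(
    true_positives=50,
    false_negatives=0,
    true_negatives=95,
    false_positives=5,
)

print(f"sensitivity: {sensitivity(example):.0%}")
print(f"specificity: {specificity(example):.0%}")
print(f"false-positive rate: {false_positive_rate(example):.0%}")
```

Because specificity and the false-positive rate share a denominator (everyone without the disease), they always sum to one, which is why 95% specificity corresponds to a 5% false-positive rate.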
That paper convinced many people that such markers were possible and set off a race to find candidate biomarkers. But the horses seem to have trouble making it to the finish line as fully validated markers. More often than not, they don’t even leave the starting gate, or they fall out of contention on the backstretch.
“I consider that article like Helen of Troy. It’s the face that launched a thousand ships,” says David F. Ransohoff, a cancer researcher at the University of North Carolina, Chapel Hill. “After that article, many labs invested in the technology, and National Institutes of Health started funding initiatives about proteomics markers. There was so much promise on the basis of that article.”
After the initial euphoria surrounding that article, the ability to translate the findings to the bedside was greatly hampered by the fact that the identities of the ovarian cancer markers were not known and by the poor overall performance of the mass spectrometry instrumentation Petricoin and Liotta used, which was not developed for clinical diagnostic assays. In addition, critics trolled through data Petricoin and Liotta had published on the Web (not part of The Lancet study) and found problems with sample collection, batch effects, and bioinformatics artifacts. Petricoin and Liotta maintain that the critics’ analyses used the data in ways for which their experiments were not designed.
That study’s problems turned out to be far from unique, though. Many studies looking for proteomics-based biomarkers suffer from various forms of bias in sample collection, analytical assays, and data manipulation and interpretation, Diamandis says. These biases, which many people don’t recognize, lead to false discoveries: putative biomarkers that can’t withstand further testing.
“It’s very easy to get specimens where there are systematic biases,” meaning differences among samples from different groups of clinical-trial participants that don’t represent cancer versus noncancer distinctions, Ransohoff says. He points to a prostate cancer study that used patient samples from 67-year-old men and control samples from individuals who averaged 38 years old, 58% of whom were women (J. Clin. Invest., DOI: 10.1172/JCI26022). This is only one example of a study where seemingly strong discrimination might have resulted from something other than the targeted disease.
“The bias can be hardwired into the specimens before they ever get to your lab,” Ransohoff says. “Addressing that kind of bias can fall through the cracks, depending on who’s leading the research and how much they involve experts in clinical research methods.”
“You have to be extremely careful in how you collect material,” says Steven A. Carr, a protein biochemist at the Broad Institute, in Cambridge, Mass. “A lot of biomarker studies are being done with retrospective collections because that’s the material that happens to be available.” A potential problem in such cases is that samples collected at different sites using different collection and processing protocols can lead to signals due to something other than the underlying disease.
Other samples are not as well characterized as they should be. “In early discovery, people use ‘convenience’ samples,” Ransohoff says. “They’re specimens from somebody’s freezer, where they don’t even know how the specimens were collected and stored.”
Other problems can arise from doing biomarker discovery in cultured cells. Cells don’t act in cell culture as they do in people, says Sudhir Srivastava, head of EDRN. When such cell cultures are grown from late-stage tumors, they may secrete proteins that early-stage tumors release into the blood in only tiny amounts, if at all. “We need to look for biomarkers in the early stages of disease,” Srivastava says.
The cell culture problem is compounded when the cells are from animals such as mice instead of people. “We need to develop biomarkers in human samples and test them in human samples,” Srivastava says.
But even good samples might not be enough to save some biomarkers. Many markers fail because of “interindividual variability,” Carr says. “Even if you have good samples, what happens in a small, carefully curated set of materials may not hold up when you go to a more population-based screen.”
Srivastava blames much of the failure to find and validate biomarkers on a lack of knowledge about the underlying biology. To develop biomarkers that permit early detection of disease, it’s important to understand the underlying disease mechanisms that produce those biomarkers, he says.
For example, Srivastava believes a greater understanding of cancer pathways and networks will accelerate the discovery of cancer biomarkers for early diagnosis. Nevertheless, conventional methods of biomarker discovery must continue as we move toward that understanding, he says.
However, Diamandis dismisses the idea that biological understanding is necessary to find and use biomarkers. He points out that clinicians have for years used prostate-specific antigen and CA125 as diagnostic markers for prostate cancer and ovarian cancer, respectively, although biologists still aren’t sure about the biological roles they play. “It doesn’t matter what they are doing if they are good markers,” Diamandis says.
An additional challenge is that it can be difficult and expensive to fully validate the sensitivity and specificity of promising biomarkers. A full prospective validation study can cost millions of dollars, and the sheer number of possible markers can make it difficult to prioritize them.
Validation is not as “glamorous” as discovery, Srivastava says, and this makes academic funding for validation studies hard to obtain in the peer review process. So diagnostic companies are more likely than academic researchers to undertake final validation studies leading to clinical use. But before companies will make that investment, they need to know that a marker has a good shot at success.
“The very first thing a diagnostic company will say is, ‘Show me your data,’” Diamandis says. “If you show them the data and it’s a sloppy or superficial discovery exercise, the diagnostic company will say, ‘I’m sorry, I’m not convinced. Why don’t you do some more work and then come back to us?’ ”
EDRN is trying to circumvent some of the obstacles to good biomarker discovery and validation. In its guidelines for biomarker development, it encourages researchers to discover new markers by using samples collected prospectively from groups of patients who could develop specific diseases, instead of convenience samples. In addition, it provides researchers with the opportunity to further test promising biomarkers using its collection of high-quality reference specimens.
Researchers agree that perhaps the best way to improve biomarker discovery and validation is to encourage collaboration between basic scientists and clinicians from the outset. That way, people who develop technology, those who study the underlying biology, and those who treat patients can learn from each other.
Such collaborations might help researchers avoid the pitfall of trying to discover markers in isolation from the intended clinical use. Many markers fail to reach the clinical stage because the intended use isn’t defined up front, says Daniel W. Chan, director of the Center for Biomarker Discovery & Translation at Johns Hopkins Medical Institutions. Therefore, the markers don’t work for the diagnostic tests the researchers try to develop. “Most researchers don’t appreciate the need for defining this critical term in the beginning. We should define the clinical intended use before we set out to discover biomarkers.”
Despite the disappointments, there have been some bright spots for proteomics-based biomarkers. In 2009, FDA approved a panel of protein biomarkers, marketed under the name OVA1, to distinguish between malignant and benign ovarian masses prior to surgery. Another blood test, called VeriStrat, is used to determine whether patients with non-small-cell lung cancer will respond to an epidermal growth factor receptor inhibitor.
Those successes notwithstanding, some question high expectations for the future of biomarkers. “I’ve frequently said that one should not expect the success rate for biomarkers to be any higher than it is for the entire discovery through clinical implementation of a drug,” Carr says. “The failure rate is probably about the same.”
Nevertheless, many hope that continuing efforts in biomarker discovery and development will eventually pay off with new tests that improve disease diagnosis and human health.
- Chemical & Engineering News
- ISSN 0009-2347
- Copyright © American Chemical Society