Biological Chemistry

Analyzing Protein Drugs

Scientists devise new, streamlined methods to understand the complexity of biopharmaceuticals

by Jyllian N. Kemsley
July 20, 2009 | A version of this story appeared in Volume 87, Issue 29

Credit: Merck & Co.
Merck's bioprocessing facility aims to produce proteins with human glycosylation patterns by expressing them in genetically engineered yeast.

Protein drugs are increasingly a key component of modern medical care. According to IMS Health, four of the top 15 U.S. pharmaceutical products by sales in 2008 were protein drugs: Enbrel and Remicade, which block a protein involved in systemic inflammation and are used to treat immune diseases such as arthritis and psoriasis; Neulasta, which promotes white blood cell production and is used to combat immunosuppression in cancer patients undergoing chemotherapy; and Epogen, which stimulates red blood cell production and is used to treat anemia in patients on dialysis for chronic renal failure.

Overall, revenues of publicly traded biotech firms, which were the first to create biologic drugs for the market, were $90 billion in 2008, according to Ernst & Young's annual report on the industry (C&EN, May 11, page 9).

Protein drugs can be extraordinarily expensive, with some treatments costing $100,000 or more per year. In contrast to small-molecule drugs, however, there is currently no regulatory pathway for the Food & Drug Administration to approve generic versions—often called follow-on biologics or biosimilars—of brand-name biologic drugs. Bills creating a regulatory process for biosimilars are currently the subject of intense debate in Congress (C&EN, April 6, page 23).

A key consideration in the discussion is that the drugs in question are inherently complex. As new treatments emerge and biosimilars are evaluated, therefore, the need to better understand biologics is more acute than ever.

To study how protein drugs work, one has to consider not just their primary amino acid sequence but also complications such as how they are processed and folded by cellular machinery and whether they remain single entities (monomers) or associate into multimers. Furthermore, biologic drugs are produced in living cells, and different cell types or strains or processing conditions may yield different protein modification patterns, as well as different impurities.

Credit: Courtesy of Barry Karger
The electron transfer dissociation (ETD) fragmentation technique clips proteins along the peptide backbone, whereas collision-induced dissociation (CID) generally targets weaker bonds.

As a result, in stark contrast to the homogeneity of small-molecule drugs, heterogeneity is an inherent part of the biologic package, says Andrew Fox, Amgen's director of regulatory affairs. Whether it is a protein associating into variable dimer:trimer ratios or one with an established array of sugars dotting its surface, a biologic drug is a heterogeneous mixture of related molecules rather than a pure compound, he adds.

This distinction is well understood by biotechnology companies and regulators but can be a shock to someone coming from a small-molecule background, Fox says. And it adds up to a big analytical challenge for those charged with controlling a drug's properties.

Studying biologics takes advanced scientific tools and approaches. Although protein drug experts say they have many good tools at their disposal to tackle this challenge, they are pushing to make those tools even better—by adapting them for automation and higher throughput, enhancing them to screen potential drug candidates and track drugs in patients more effectively, and improving their ability to control manufacturing quality.

To better understand biologic drugs, R&D labs might use upward of 20 to 30 analytical methods to study the characteristics of a protein drug target: what its amino acid sequence is, whether it gets spliced or truncated, how it folds, how many disulfide bonds between amino acids might be holding it together, whether it associates into multimers or aggregates into insoluble clumps, how the protein has been modified after translation from RNA, how stable it is, how active it is in binding to a target or catalyzing a reaction, and so on.

The list of methods used to answer these questions includes Edman sequencing, in which N-terminal amino acids of a protein are sequentially labeled, cleaved, and identified; gel electrophoresis; immunoassays; analytical centrifugation; mass spectrometry (MS); ultraviolet, fluorescence, circular dichroism, and nuclear magnetic resonance spectroscopies; high-performance liquid chromatography (HPLC); and capillary electrophoresis (CE). Scientists then piece together the results to get as close as possible to a full picture of a drug's composition and activity. In contrast, one or two methods might tell researchers most of what they need to know about the structure of a small-molecule drug.

Once companies have a good understanding of their protein, they whittle down the array of analysis tools to 12 to 15 critical methods that will then be used routinely for manufacturing. Which methods get used, however, depends on the characteristics of the protein in question. "You're not simply taking a cookie-cutter approach," Fox says. Although one might always test for pH, other tests "will very much vary, based on the nature of the molecule and the quality attributes that are critical from a clinical perspective," he says.

A major focus of biologic drug developers has been understanding and controlling protein glycosylation. Glycosylation generally involves the addition of polysaccharide groups, or glycans, to multiple locations on proteins. It is an enzyme-directed, site-specific process that modulates the structure, stability, and function of proteins.

Glycosylation is sensitive to the particular cells and conditions in which a protein is produced, and it does not naturally occur in bacteria or yeast. It not only influences the activity and stability of a protein but can also play a role in a patient's immune response to a biologic drug. Understanding and controlling glycosylation are, therefore, critical to manufacturing biologics.

MS and CE are two common techniques for analyzing glycan groups. Historically, however, they have not been straightforward to implement and have not had good reproducibility, except perhaps when put into very well-trained, and thus very expensive, hands. Pauline Rudd, a professor of glycobiology at University College Dublin, and her colleagues at Ireland's National Institute for Bioprocessing Research & Training set out to help address that problem by developing a glycan analysis method using HPLC that requires minimal training and can in large part be automated.

The method starts with immobilizing a target protein in 96-well plates (Anal. Biochem. 2008, 376, 1). The protein might be cut out of a gel, isolated from a manufacturing stream, or obtained from a blood serum sample. Sugars are enzymatically cleaved from the protein; labeled with a fluorescent tag, 2-aminobenzamide; and analyzed by quantitative HPLC. Each initial peak is assigned a "glucose unit value" (GUV), and computer software matches the GUV to possible known structures. Further aliquots of the glycan samples are then enzymatically digested to determine detailed glycan structures.
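
The GUV matching step can be pictured with a short sketch. It is an illustration only, not the software Rudd's group uses: it assumes retention times are converted to glucose units by interpolating against a glucose-oligomer calibration ladder (a calibration scheme the article does not describe), and every ladder point, glycan name, and reference value below is a made-up placeholder.

```python
from bisect import bisect_right

# Hypothetical calibration ladder: (retention time in minutes, glucose units).
LADDER = [(10.0, 2.0), (14.0, 3.0), (18.5, 4.0), (23.0, 5.0), (27.0, 6.0), (30.5, 7.0)]

# Hypothetical reference table; names and values are placeholders, not measured data.
REFERENCE = {"glycan_A": 5.40, "glycan_B": 5.45, "glycan_C": 6.20, "glycan_D": 7.10}


def retention_to_gu(rt, ladder=LADDER):
    """Convert a retention time to a glucose unit value (GUV).

    Piecewise-linear interpolation between ladder points is a simplification
    chosen for this sketch.
    """
    times = [t for t, _ in ladder]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time outside the calibrated range")
    # Index of the ladder point just above rt, clamped to the last point.
    i = min(bisect_right(times, rt), len(ladder) - 1)
    (t0, g0), (t1, g1) = ladder[i - 1], ladder[i]
    return g0 + (g1 - g0) * (rt - t0) / (t1 - t0)


def candidate_structures(gu, reference=REFERENCE, tolerance=0.1):
    """Return the names of reference glycans whose GUV lies within the tolerance."""
    return sorted(name for name, ref in reference.items() if abs(ref - gu) <= tolerance)


if __name__ == "__main__":
    gu = retention_to_gu(24.68)
    print(round(gu, 2), candidate_structures(gu))
```

In this toy table a single peak matches two candidates, which is the kind of ambiguity the follow-up enzymatic digests are meant to resolve.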

Not only is the approach automatable, but it can detect sugars down to the femtomole level—a key advantage when looking at protein glycosylation, because carbohydrate modifications are highly varied and some glycans may occur only in small amounts. Erythropoietin, for example, has on the order of 100 different glycan variations. "You have to make sure you get complete recovery" of the glycans being analyzed, Rudd emphasizes, or you could miss key information. Rudd is collaborating with equipment manufacturers to improve resolution, miniaturize components, and speed up sample analysis further.

Other researchers are trying to adapt MS techniques to make them more useful in quality-control settings. One example involves coupling MS to electron transfer dissociation (ETD), a technique that uses large aromatic anions, such as fluoranthene, to fragment proteins into peptides. ETD differs from traditional MS fragmentation techniques such as collision-induced dissociation (CID) in that ETD specifically fragments the protein backbone, whereas CID typically breaks up the protein at its weakest points, which can include bonds to glycans or phosphates that have been added to the protein, as well as a few backbone bonds.
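
The difference between the two fragmentation modes shows up in the masses an instrument records. The sketch below relies on standard mass spectrometry conventions that the article does not spell out: when CID does break the backbone it typically yields b and y ions, while ETD yields c and z ions. It computes singly charged, monoisotopic m/z values for one cleavage site in an unmodified peptide. The function name is hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276    # mass of H+
WATER = 18.010565    # H2O
AMMONIA = 17.026549  # NH3
HYDROGEN = 1.007825  # H atom


def fragment_mz(peptide, site):
    """Singly charged m/z values for a cleavage after residue `site` (1-based)."""
    if not 0 < site < len(peptide):
        raise ValueError("site must fall between the first and last residue")
    n_term = sum(RESIDUE_MASS[aa] for aa in peptide[:site])
    c_term = sum(RESIDUE_MASS[aa] for aa in peptide[site:])
    b = n_term + PROTON          # CID: N-terminal fragment from amide-bond cleavage
    y = c_term + WATER + PROTON  # CID: C-terminal fragment
    return {
        "b": b,
        "y": y,
        "c": b + AMMONIA,                 # ETD: N-terminal fragment keeps the amide NH
        "z_dot": y - AMMONIA + HYDROGEN,  # ETD: radical C-terminal fragment
    }


if __name__ == "__main__":
    for ion, mz in fragment_mz("GA", 1).items():
        print(f"{ion}: {mz:.4f}")
```

A glycan or phosphate that survives ETD would add its own mass to every fragment containing the modified residue, which is how the modification site can be read off. This sketch leaves modifications out.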

Because ETD starts fragmenting a protein from either the N- or C-terminus, depending on conditions, it can be used to sequence proteins. "Depending on the size of the protein, you may be able to sequence up to 50% of it in a matter of a minute," says Andreas Hühmer, director of proteomics marketing at Thermo Fisher Scientific. Researchers can thus use the technique as a relatively quick way (compared with traditional sequencing) to ensure they're correctly tracking a particular protein.

Credit: Anal. Chem.
The rate at which a protein exchanges hydrogen for deuterium can be used to evaluate the conformation of immunoglobulin G. The structure shown is a model derived from crystal structures of the antibody's antigen-binding regions (top) and its constant region (bottom).

Microfluidics is also being used increasingly to aid the development and manufacturing of biologics and other drugs, says Barry L. Karger, a professor of chemistry and director of the Barnett Institute at Northeastern University. He points to the use of microfluidic chips to replace traditional LC separations at the front end of MS as one example. "With microfluidic chips, you have good reproducibility from run to run and chip to chip," and the process is fairly simple and convenient, Karger says. "The more automation you can have, and the less human interpretation and operation, the better."

Last fall, the Barnett Institute launched the Center for Advanced Regulatory Analysis (CARA), which aims to help transfer analytical techniques from the basic research stage to the drug development arena. One CARA project has involved a partnership with Momenta Pharmaceuticals to use CE/MS to look at intact glycoforms of proteins. Other glycosylation analysis techniques typically involve cleaving glycans from proteins, yielding the different forms, "but you still haven't put the molecule together," Karger says. Using CE/MS to locate the glycans at specific sites on the protein gives another level of information, he says.

Another technique that Karger, Barnett Institute research professor Shiaw-Lin (Billy) Wu, and colleagues are working on is using MS with a combination of both ETD and CID to identify disulfide bridges in proteins (Anal. Chem. 2009, 81, 112). Disulfide bridges, which link the thiol groups of cysteine residues, play a critical role in the folding and stability of proteins. "Disulfide scrambling" typically means an incorrectly folded, inactive molecule. Putting CID and ETD together in a combined LC/MS system reduces the need to run multiple experiments to analyze disulfides, Karger says. The method even identifies the interactions within tissue plasminogen activator (TPA), a clot-dissolving enzyme that folds with 17 disulfide bonds. Complicated proteins, such as TPA, still require manual interpretation of the data, the researchers say, but analysis of simpler ones—for example, human growth hormone, which has only two disulfide linkages—can be automated.
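
A rough count suggests why interpretation is so much harder for TPA than for human growth hormone. The sketch below makes a simplifying assumption that does not come from the researchers: every cysteine is paired, so a protein with n disulfide bonds has 2n cysteines, and the question is how many distinct ways those cysteines could in principle be paired.

```python
from math import factorial


def possible_pairings(n_bonds):
    """Number of ways to pair 2n cysteines into n disulfide bonds.

    Uses (2n)! / (2**n * n!), the count of perfect matchings on 2n items.
    Assumes every cysteine is paired (no free thiols).
    """
    if n_bonds < 0:
        raise ValueError("n_bonds must be non-negative")
    return factorial(2 * n_bonds) // (2 ** n_bonds * factorial(n_bonds))


if __name__ == "__main__":
    print(possible_pairings(2))   # two bonds, as in human growth hormone
    print(possible_pairings(17))  # seventeen bonds, as in TPA
```

Under that assumption, two bonds allow only 3 arrangements, while 17 bonds allow on the order of 10^18. This is an upper bound on the search space, not a description of how the combined CID/ETD method works.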

John R. Engen, a chemistry professor at Northeastern and a fellow at the Barnett Institute, is using MS to tackle questions about protein structure. He and his coworkers take advantage of hydrogen/deuterium exchange reactions to illuminate overall protein conformation and dynamics. By using MS to monitor the exchange of protein backbone amide hydrogens with deuterium in solution, they recently studied how glycosylation of the antibody immunoglobulin G might affect the way it interacts with its receptors (Anal. Chem. 2009, 81, 2644). The approach uses only picomoles of material. Engen and colleagues envision that, with appropriate software, the technique could be used as a screening tool to evaluate how protein modifications affect activity. A collaboration between Engen and Waters Corp. aims to make such instrumentation commonplace (Anal. Chem. 2008, 80, 6815).
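
The arithmetic behind a hydrogen/deuterium exchange measurement can be sketched as follows. The formula is a commonly used back-exchange correction and is not necessarily the exact procedure in the cited papers. The function names and the example masses are hypothetical. The inputs are the average masses of a peptide before labeling, after a given labeling time, and after full deuteration.

```python
def exchangeable_amides(sequence):
    """Count backbone amide hydrogens that can exchange.

    The first residue carries a free amine rather than an amide, and proline
    has no amide hydrogen, so neither is counted.
    """
    return sum(1 for aa in sequence[1:] if aa != "P")


def deuterium_uptake(m_t, m_0, m_full, sequence):
    """Estimate the number of backbone amides deuterated at one time point.

    m_t    : average mass after labeling for time t
    m_0    : average mass of the undeuterated control
    m_full : average mass of the fully deuterated control
    """
    if m_full <= m_0:
        raise ValueError("fully deuterated control must be heavier than the undeuterated one")
    return (m_t - m_0) / (m_full - m_0) * exchangeable_amides(sequence)


if __name__ == "__main__":
    # Hypothetical masses (Da) for a hypothetical five-residue peptide.
    print(deuterium_uptake(1001.5, 1000.0, 1003.0, "GAPLK"))
```

Comparing uptake for the same peptide from differently glycosylated samples is the kind of readout that could point to a change in conformation or dynamics.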

With all of the data that biologic drug researchers have in hand, two important issues remain outstanding: What key properties are scientists currently missing in their analyses of biologics, and what is the clinical significance of all of the analytical data they currently do obtain?

The first question stems directly from the heparin adulteration that rocked the pharmaceutical industry last year (C&EN, May 12, 2008, page 38). Even as drug companies push to streamline and automate, they must also have tools with high information content, Karger says. "You have to think in terms of coming up with methods that in principle find things that you wouldn't think would be there."

The significance of analytical data can play out in several ways. In one example, Genzyme ran into trouble last year when it tried to scale up manufacturing of alglucosidase alfa (Myozyme), an enzyme to treat Pompe disease, a rare inherited metabolic disorder (C&EN, April 28, 2008, page 25). When the company applied to FDA to open a 2,000-L reactor to supplement its existing 160-L process, the agency denied the application, stating that enzyme produced in the larger reactor had a different glycosylation profile and had to be treated as a different drug. Genzyme is now trying to get material from the 2,000-L facility approved as Lumizyme.


If something as simple as a change in reactor size can trip up one company, the hurdles get much higher for competitive companies trying to get biosimilars approved. Typically, "generic" biologics will be produced in different cell lines than first-generation biologics, with different growth media and processes, leading to alterations in impurity profiles, distinctive glycosylation patterns, and other variations relative to the original product. A collaboration between William S. Hancock, a Northeastern chemistry professor and Barnett Institute fellow, and Genentech recently demonstrated that glycan structures can serve as a molecular fingerprint to identify exactly who manufactured a specific protein.

A major issue for biologic drug developers today is how to tell whether something that is analytically significant is also clinically significant, says David Robinson, vice president of bioprocess research and development at Merck & Co.—which will be trying to emulate human protein glycosylation patterns in proteins produced by genetically engineered yeast (C&EN, Jan. 12, page 28).

A big challenge for companies producing biologic drugs, therefore, is to close the gap between analytical data and clinical knowledge. Nobody ever said that analyzing biologic drugs was going to be easy. But as analytical technology continues to advance, the goal of better understanding the workings of biologics is coming ever closer to realization.

