Making lab data work better

Centralized and standardized analytical data management is changing drug research and development, and all of chemistry, bit by bit

By Poornima Apte, C&EN BrandLab Contributing Writer

Credit: Shutterstock

The COVID-19 pandemic, which triggered a widespread and sudden shift to remote work, thrust a problem to the forefront: laboratories urgently need a centralized and searchable library of all analytical test results and associated metadata, accessible on demand to everyone who needs it.

Traditionally, sharing the results of an experiment among colleagues is inefficient, with scientists either emailing large files back and forth or manually assembling documents, then passing them along. An alternative approach is to centralize all results into a unified library. Scientists in the structure elucidation group at Pfizer have implemented such a strategy and can now access results from wherever they are. Pankaj Aggarwal, a principal scientist at Pfizer, says, “We can do the data analysis and even control our instruments from home.”

Having all a laboratory’s data in one place has other benefits, according to David Foley, a senior principal scientist at Pfizer. A scientist unaware of or unable to find data from a particular experiment too often simply runs that test again, wasting time and laboratory resources. A centralized library “removes the duplication of work,” Foley says.

The time saved by creating a central library for analytical results is critical because drug discovery and production cycles in the pharmaceutical industry are becoming increasingly compressed. “In the past, projects would take 10–20 years,” Aggarwal says. “Now we are talking about products getting to market in 5 years.”

Wrangling data

The data problem is not limited to access. The results from analytical experiments arrive in such varying formats that cataloging them all and accessing desired files at later dates becomes time consuming and tedious.

ACD/Labs has been standardizing analytical data—converting disparate data sets from myriad instruments—via its software for over 20 years, says Sanji Bhal, the firm’s director of marketing and communications. More recently, the Allotrope Foundation has been working to create a unified ontology for analytical techniques in the life sciences and pharmaceutical industries.

Analytical data originate from a large number of manufacturers that service the markets for mass spectrometers, liquid chromatography systems, nuclear magnetic resonance (NMR) spectrometers, and other laboratory instruments. Each manufacturer typically delivers data sets in its own format, through proprietary software that’s incompatible with any other. Since scientists don’t confine themselves to just one technique to assay a compound during, for example, structure identification or compound registration, they end up using several pieces of associated software for each technique and having to assemble those disparate data to make decisions.

Such a process leaves scientists mired in mundane tasks related to data management instead of moving on to more meaningful data analysis. “Scientists have generally learned to live with it, but the hope is that standardization simplifies the process from data acquisition to decision,” Bhal says.

Get smart

Another benefit of data standardization is that it shifts the scientist’s focus from the analytical technique to the molecule, which is where it should be. With ACD/Labs’ help, Pfizer is developing a central library of scientific information about molecules. “It’s like when you walk into a library and you have different sections on historical fiction or personal finance,” says Vijay Bulusu, Pfizer’s head of data and digital innovation for pharma science R&D. “Similarly, we’re bringing all this analytical and scientific information, with the multiple sections being NMR, LC-MS [liquid chromatography–mass spectrometry], etc., into one library.”

In the pharmaceutical industry, scientists are using software for each of those library sections to find subtle patterns in data—for example, a suspicious peak in a spectrum arising from an impurity that consistently happens with samples from one lot. Algorithms can also find anomalies and similarities in analytical data, drawing scientists’ attention to correlations that might otherwise be lost. Such algorithms are like navigation aids that let chemists rapidly maneuver through large volumes of data, finding insights that could have been missed. “If you had a paper road map and hit a roadblock, you might not immediately find a way around it,” says Karim Kassam, ACD/Labs’ senior director of customer success. “But a live, interactive map would make navigation much easier.”

Data standardization also sets the stage for advanced technologies, such as artificial intelligence (AI) and machine learning (ML). “We’re starting to develop AI and ML models and algorithms with these big data repositories we are creating,” Bulusu says.

Every sample analyzed by the structure elucidation group at Pfizer is tagged with an electronic number, which can be used to track all its associated tests. “We envision that in the future, any analyst will be able to use a compound number and structure and get all the [associated] analytical data, irrespective of the technique,” Aggarwal says. “We are creating individual data libraries, but we are not creating data silos.”
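
The idea of one sample number linking results from every technique can be illustrated with a minimal Python sketch. This is not Pfizer's or ACD/Labs' implementation; the class names, fields, sample numbers, and record contents below are all hypothetical, chosen only to show how per-technique libraries can share a single lookup key.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalyticalResult:
    sample_id: str   # hypothetical electronic sample number
    technique: str   # e.g. "NMR" or "LC-MS"
    summary: str     # placeholder for the standardized result payload


class SampleLibrary:
    """Toy in-memory index: one sample number maps to results from every technique."""

    def __init__(self):
        self._by_sample = defaultdict(list)

    def add(self, result):
        # File the result under its sample number, whatever the technique.
        self._by_sample[result.sample_id].append(result)

    def results_for(self, sample_id, technique=None):
        # Return all results for a sample, optionally narrowed to one technique.
        results = self._by_sample.get(sample_id, [])
        if technique is None:
            return list(results)
        return [r for r in results if r.technique == technique]


# Illustrative records only; these are not real data.
library = SampleLibrary()
library.add(AnalyticalResult("S-0001", "NMR", "1H spectrum, illustrative record"))
library.add(AnalyticalResult("S-0001", "LC-MS", "chromatogram, illustrative record"))
library.add(AnalyticalResult("S-0002", "NMR", "13C spectrum, illustrative record"))

for result in library.results_for("S-0001"):
    print(result.technique, "|", result.summary)
```

Because every record carries the same sample number, a query by that number returns NMR and LC-MS results together, while the optional technique filter still lets each group browse its own section of the library.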

To learn more, watch this webinar on how various analytical groups at Pfizer are surmounting the data management problem and improving their productivity and data access.
Vijay Bulusu
Head, Data & Digital Innovation, PharmSci, Worldwide Research & Development, Pfizer
David Foley
Senior Principal Scientist, NMR & MS Team Lead, Pfizer
Pankaj Aggarwal
Principal Scientist, LC Method Development Team Lead, Pfizer

