The science research community is known for its enthusiastic engagement with breakthrough technologies. It also has a history of confronting ethical concerns posed by these technologies well after they are in common use. This history is repeating itself this year after the rise of generative AI models that people without coding expertise can use. Some scientists are calling for a pause in activity to weigh risks and establish guidelines. Others, however, are resisting not only the pause in deployment but also any restrictions on the use of a fast-morphing tool that they say will benefit from unimpeded experimentation.
This year may well go down as the year of ChatGPT. The generative artificial intelligence program, and others like it, made news as the public dived into using AI platforms with no more than common, conversational language. People created mash-up images, passable sonnets, and term papers that might just fly with the professor. The public mood swung from fear of doom to enthusiasm for AI.
The backlash made the news as well. Geoffrey Hinton, whose laboratory at the University of Toronto invented the technological foundation for generative AI, joined the technology’s critics in May, warning of the dangers of moving full throttle to develop platforms like ChatGPT. Critics are concerned that by eliminating the need to code to deploy AI, the technology hands a ready-made tool to bad actors.
After-the-fact reckoning is nothing new in science. The rush to put breakthrough technology to use in laboratories has more than once led to a kind of armistice between enthusiasts and those concerned with consequences that should have been imagined before the technology came out of the box. DNA modification and gene editing are prominent examples of fields where research leaders called for a temporary slowdown in activity to arrive at a set of agreed-on ethical guidelines.
A similar call can now be heard with generative AI. People in research labs, professional organizations, and the social sciences say now is the time to weigh the ethical implications of generative AI in science research.
But AI is a tool that changes much more rapidly than recombinant DNA and gene editing did. And this time, advocates for a science ethics summit are determined to convene a broadly interdisciplinary meeting. They want to gather not only chemists and biologists but also social scientists, science historians, ethicists, and others with a perspective on the impact of new technology in the lab.
The problem is that chemists and biologists are not entirely on board. Many reject the idea of a cooling-off period at a time when experimentation is a necessary aspect of directing the evolution of AI. The stakes are high, given the dystopian future that critics say could result from unbridled AI.
“Science research, whether it’s the use of AI in specific endeavors or research on AI itself, is not adequately focused” on ethical concerns, says Risto Uuk, European Union research lead at the Future of Life Institute (FLI), an organization launched in 2014 to guide the development of technologies toward human benefit.
The FLI recently negotiated with lawmakers in the EU on a draft regulation for developing and using AI. Uuk points out that the draft makes no mention of guidelines for deploying the technology safely in scientific research. And proposed amendments filed by researchers all asked regulators to steer clear of setting any restrictions on AI deployment in scientific research.
“The draft regulations kind of assume that if somebody does research, then they should just do it very freely and there should be no requirement for them apart from some kind of ethical review questions if you experiment with human beings,” Uuk says.
Uuk calls this a misguided approach that will impede efforts to develop AI as a beneficial technology. “Oftentimes, issues that arise in the research space present themselves downstream in the deployment of AI systems,” he says.
Theresa Harris, program director of the Center for Scientific Responsibility and Justice at the American Association for the Advancement of Science (AAAS), a science advocacy group, says many in the research community are aware of ethical concerns and frustrated by the lack of safeguards to prevent problems, especially in the area of data generated by AI.
“The traditional systems, such as the institutional review boards, aren’t well suited to the specific ethical concerns being raised right now,” Harris says. While AI can facilitate “straight-up fraud,” it is the more nuanced generation of insupportable data that is of most concern, she says. “The traditional process of understanding what is original and what is manufactured or copied—AI is raising a lot of new questions about that.”
Another concern is how AI perpetuates bias in existing data, Harris says. And measures to ensure data security have not yet been established, she adds.
Harris and others point out that AI itself doesn’t create the ethical concerns. Rather than introducing problems, the technology tends to accelerate the pace of research, including unethical practices. Just as human researchers are needed to direct AI systems—establishing data models and assessing outcomes—ethical researchers are the foundation of the ethical use of AI, they say.
Over the horizon, however, are machines that combine AI and robotics and are capable of executing on their own decisions, eliminating humans from the loop.
The prospect of rogue, malevolent AI cannot be ignored, but ethicists see more immediate concerns arising from systems already in place. Publishers of scientific journals, for example, are already scrambling to combat dubious data, often referred to as “hallucinated,” inserted into submitted papers by generative AI models. So far they haven’t gained much traction.
Holden Thorp, editor in chief of the Science family of journals published by AAAS, says he is wary of publishing papers based on research that uses AI. “We have taken a pretty restrictive stance,” he says, given the lack of consensus in the research community about how the technology should be used. Thorp also refers to a decades-long headache caused by a disruptive technology that emerged in the 1990s: Photoshop.
Researchers seized on Photoshop to improve the presentation of images of gel electrophoresis and other lab experiments. “Sometimes they crossed the line,” Thorp says. “It was years before it was well understood what you could do to your gels and what you couldn’t. And that created this bolus of papers that we are constantly relitigating.”
Generative AI may already be creating another such backlog. “I hope this analogy doesn’t play out all the way, because we are now just getting tools to detect image manipulation,” Thorp says. “And it’s been 25 years.”
James Milne, president of the Publications Division of the American Chemical Society, which publishes C&EN, is more focused on how AI can advance science than on its potential to generate a wave of dubious papers.
“I know from researchers that generative AI has been an active talking point in the last 9 months or so, and mostly about how AI can advance their research in a really positive way,” he says. “So it’s the classic thing. There are concerns but also opportunities. Most people are looking at the land of opportunity.”
When ACS updated certain policies earlier this year, it incorporated guidance on AI, Milne says. The move was prompted by the emergence of generative models such as ChatGPT, he says, noting that AI itself has been in use for years. “You can use it to tidy up your paper or analyze data, but you must be clear where you are using AI technology,” he says.
While technology for flagging AI-generated data is still under development, the classic quality-control method is in effect and up to the job, Milne says. “Editors will sort through papers as they’re received, based on their really deep understanding of the science.”
Researchers in chemistry and pharmaceutical labs are, like Milne, focused on the positive when it comes to AI, and they are already reaping the benefits. The speed at which Pfizer gained approval for its COVID-19 drug Paxlovid was bolstered by the company’s deployment of machine learning, for example.
Another example is the recent use of ChatGPT at the University of California, Berkeley, to create datasets about highly porous metal-organic frameworks (MOFs). The project demonstrates how researchers who are not expert coders can deploy powerful AI using a common-language generative model, according to Omar M. Yaghi, a chemistry professor at Berkeley who led the effort.
“My interest and my group’s interest in ChatGPT is to simplify and speed up the process of materials discovery,” Yaghi says. The researchers want to use machines to revamp a process that involves many experimental steps and hours of routine work. “Some of it has to do with how you make observations about the results of different trials,” he says. “Humans are very important in this path to materials discovery. But humans can make mistakes.”
Yaghi says he uses common, conversational language to instruct ChatGPT to carry out certain essential tests. The program can be trained to mine diverse texts as it investigates methods of preparing MOFs. “There is a lot of heterogeneity in how [researchers] report,” Yaghi says, adding that ChatGPT collated data from about 220 papers and tabulated them with over 95% accuracy.
As for the ethical implications of turning the research process over to a machine that is known to generate inaccurate data? “I don’t see any,” Yaghi says. “I think this is a very useful tool.”
Yaghi emphasizes that the program is trained by scientists who look at results to see what makes sense. “They have to corroborate results and follow a rigorous scientific method,” he says. “That doesn’t disappear with ChatGPT or AI at this stage of its development.”
And scientists are not forging ahead heedless of AI’s shortcomings, Yaghi adds. “We address hallucination in our work,” he says. “Yes, that is a risk, but you can actually instruct ChatGPT and say, ‘If you don’t know the answer for sure, please do not give me an answer.’ ”
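The workflow Yaghi describes can be pictured as a simple prompt-building step: a conversational instruction asks the model to tabulate synthesis conditions and to decline rather than guess. The sketch below is a minimal illustration; the prompt wording, the field names, and the example synthesis excerpt are assumptions for illustration, not the Berkeley group’s actual protocol, and no particular chat-model interface is assumed.

```python
# A minimal sketch of common-language text mining: ask a chat model to
# tabulate MOF synthesis conditions from a paper excerpt, and instruct it
# to decline rather than hallucinate. All wording here is illustrative.

def build_extraction_prompt(paper_excerpt: str) -> str:
    """Assemble a conversational instruction for a generative AI model."""
    instruction = (
        "From the synthesis description below, tabulate the metal source, "
        "organic linker, solvent, temperature, and reaction time. "
        "If you don't know a value for sure, please do not give an answer; "
        "write 'not reported' instead.\n\n"
    )
    return instruction + paper_excerpt

# Hypothetical excerpt of the kind of heterogeneous reporting the model mines
prompt = build_extraction_prompt(
    "The framework was prepared from a zinc salt and terephthalic acid "
    "in DEF at 100 degrees C over 24 h."
)
```

The resulting string would then be sent to the chat model through whatever interface the lab uses; the anti-hallucination clause mirrors the instruction Yaghi quotes.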
Regina Barzilay, a computer science professor at the Massachusetts Institute of Technology and colead of the Machine Learning for Pharmaceutical Discovery and Synthesis Consortium, views AI as “a true turning point in technology” that will free scientists from data drudge work and repetitive experimentation.
Ethical considerations fall into two categories, Barzilay says: appropriate use in accordance with the scientific method, and use for an appropriate purpose. The first consideration is handled through established protocols and science culture. “The second is a regulatory question,” she says.
Reporting data from experiments augmented by AI poses a challenge, Barzilay acknowledges, but one that can be managed by priming the system with data at the outset of an experiment. From there, the familiar process of peer review and repeated experimentation takes over. A technology that throws an unexpected curveball from time to time is nothing new, she says.
Barzilay views AI as distinct from DNA manipulation and gene editing in that it introduces nothing new that would raise ethical questions. “In some ways, AI is not new functionality,” she says. “You are doing stuff with chemistry that you used to do the old way.”
Alán Aspuru-Guzik, a professor of chemistry and computer science at the University of Toronto, agrees. He views the big risk in chemistry as the production of molecules with unexpected downstream consequences and the environmental impact of chemicals generally—problems that are not caused by AI. Rather than creating threats, he says, AI can actually help navigate them.
Still, researchers are conscious of how AI can facilitate negative results. “We all think about it in my lab,” Aspuru-Guzik says. “Many scientists are thinking about it.”
While many researchers say AI as it exists today need not—and even should not—be reined in over ethical concerns, the question lingers: How long will AI exist as it does today? With the rise of fully automated laboratories, some foresee a seamless integration of computers and robots capable of designing and producing things without human researcher oversight and intervention.
Rafael Gómez-Bombarelli, a professor of materials science and engineering at MIT, speaks of an execution gap: the disconnect between AI and robotics that still requires researcher intervention. Today’s laboratory robots, he says, require a high level of fine-tuning and maintenance by researchers. But if systems are developed to the point of being liberated from scientists, the negative consequences could reach the realm of new toxic substances or chemical weapons.
“These are very early days,” Gómez-Bombarelli says. Automated labs are expensive, and getting computers and robotics to work together is a difficult task. Still, the execution gap is closing, and automated labs could one day be accessible to the masses. “If one succeeds in democratizing AI on demand, we have a problem,” he says.
Connor Coley, a professor of chemical engineering, electrical engineering, and computer science at MIT, sees the concerns as more immediate. But AI systems already in place have safeguards too, he says.
“In principle, the difference between optimizing a molecule to be safe and nontoxic and optimizing a molecule to be toxic is a negative sign,” he says, referring to the idea of dual-use manipulation of algorithms. “In practice, it doesn’t quite work like that. There is still a relatively high skill level involved in taking an idea of a new molecular structure that has been designed computationally and implementing it in the real world.”
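Coley’s sign-flip point can be made concrete with a toy sketch. The “model” below is a hypothetical lookup table standing in for a learned property predictor, and the molecule names are invented; the point is only that the same search code serves opposite goals depending on one sign in the objective.

```python
# Toy illustration of the dual-use sign flip: the difference between
# selecting for safety and selecting for harm is one sign in the objective.
# predicted_toxicity is a hypothetical stand-in for a learned model.

def predicted_toxicity(molecule: str) -> float:
    """Stand-in for a learned toxicity model (hypothetical scores)."""
    scores = {"mol_A": 0.1, "mol_B": 0.9, "mol_C": 0.5}
    return scores[molecule]

def best_candidate(molecules, maximize_toxicity=False):
    """Pick the molecule that optimizes the objective.

    Flipping the sign turns a safety search into a harmful one."""
    sign = 1.0 if maximize_toxicity else -1.0
    return max(molecules, key=lambda m: sign * predicted_toxicity(m))

candidates = ["mol_A", "mol_B", "mol_C"]
safest = best_candidate(candidates)                              # "mol_A"
most_toxic = best_candidate(candidates, maximize_toxicity=True)  # "mol_B"
```

As the quote goes on to note, the real-world barrier is not this arithmetic but the skill needed to turn a computationally designed structure into a physical molecule.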
Coley is among those who say now is the time to begin the ethics discussion. AI for Science, a series of workshops launched in 2021, has taken up the topic of the ethical application of AI in the lab, Coley notes. At the Gordon Research Conference earlier this year, he chaired a panel titled “Artificial Intelligence and the Modernization of Chemical Synthesis,” which addressed how AI can affect the emergence of chemical and biological threats.
As to whether there’s a way to protect against malicious dual use of algorithms, “the simple answer is no,” says Marinka Zitnik, a professor of biomedical informatics at Harvard Medical School and coordinator of AI for Science. “This is especially true because we are pushing toward open science and open-source algorithms.” Barriers are also coming down as generative AI programs make it easier for people with no coding experience to get involved.
Zitnik says the science community is increasingly calling for broad ethical guidelines. She recently coauthored a paper titled “Scientific Discovery in the Age of Artificial Intelligence,” which outlines the benefits of trainable deep learning and generative AI systems in processing vast amounts of unlabeled data. Both developers and users of such tools need a better understanding of where and how they work, as well as strategies for dealing with poor data quality, she concludes.
Zitnik says guidelines should be developed by a multidisciplinary consortium that includes the social sciences, a field more practiced in weighing ethical concerns than chemistry and biology. “What is required is an all-science, interdisciplinary committee together with experts who are studying the philosophy and history of science in order to have a discussion and potentially draft guidelines on what responsible use would entail.”
The FLI published an open letter in March calling for a 6-month pause in the training of AI systems more powerful than GPT-4, the current iteration of ChatGPT, so that the research community can develop and implement safety protocols. The letter has garnered over 33,000 signatures, including those of Steve Wozniak, cofounder of Apple, and Elon Musk, CEO of Tesla, CEO of SpaceX, and owner of X (formerly Twitter). The letter cites concerns raised and principles agreed on at the Asilomar Conference on Beneficial AI, which took place in 2017.
The Asilomar conference was organized by the FLI at the Asilomar Conference Grounds in Pacific Grove, California. It generated the Asilomar AI Principles, which address technology, funding, and justice issues for AI development broadly.
The Asilomar center was also the site of the 1975 International Conference on Recombinant DNA Molecules, a landmark event in the ethics of scientific research, at which about 140 biologists, as well as lawyers, physicians, and a few ethicists, drew up voluntary safety guidelines. That Asilomar conference provided a template for a similar meeting in 2015 on the ethical use of CRISPR gene editing.
Those concerned with establishing ethical protocols for science research are now calling for a return to Asilomar for a summit on the deployment of AI. But they acknowledge that scientists are unlikely to take up the cause without a sense that they really need to.
“We can debate all day what exactly researchers can do, but at the very least the researchers should have a requirement to reflect more on the possible risks of their work and the possible mitigation that they can take,” Uuk says. “But I don’t think academics or industry would do that voluntarily. There has to be some incentives from law.”
Uuk notes that the UK government will convene a conference on AI safety later this year. The UK AI Safety Summit, scheduled for early November, will take place at Bletchley Park, England, famous for being where Germany’s Enigma code was deciphered during World War II. Billed as a global event, the summit will focus broadly on mitigating the risk of AI through international cooperation.
Moving toward the safe use of AI in research may be a matter of getting to know the machine better, according to Leilani Gilpin, a professor of computer science and engineering at UC Santa Cruz and an affiliate of the school’s Science and Justice Research Center.
“In the spheres I’m in, everybody talks about” ethics, she says. “But I think in the larger scientific community—people who work more in the physical and biological sciences—a lot of them are unaware of how these systems are built.”
Gilpin and her colleagues are working on the hallucination problem, devising ways to computationally flag made-up, guessed-at, and ultimately incorrect data generated by AI. Gilpin’s favorite example of a mistake made by ChatGPT amounts to something like laziness on the part of the program: it identified her as a professor at UC Berkeley, where she has never taught.
Gilpin emphasizes that the scientists developing AI systems—programming new or updated generative AI models, for example—aren’t likely to slow down for a broad-based discussion of ethics. “Before I worked as a professor, I worked in Silicon Valley for a couple of years,” she says, “and there is not a safety or ethics culture around computer science in Silicon Valley.” The attitude is captured, she says, in a discontinued motto at Meta, formerly Facebook: Move fast and break things. “That is the complete opposite of ethical thinking,” Gilpin says.
Nonetheless, various organizations have challenged the industry by proposing principles and guidelines. Most state as their premise AI’s power to change the world, creating high-stakes ethical responsibility for computer designers, programmers, and users.
The Association for the Advancement of Artificial Intelligence (AAAI) highlights the need for AI to contribute to human well-being with a variation on the Hippocratic oath’s “First do no harm.” The Vector Institute’s AI trust and safety principles include remaining true to democratic principles, maintaining adequate oversight, and ensuring corporate responsibility.
While these lists fall short of addressing the specifics of science research, the high-level ethics guides are useful for thinking about AI, according to Susan Epstein, ethics chair of AAAI’s executive council. She agrees with others that the problems with AI currently stem from researchers’ use of the tools.
“The best way to produce ethical science is to educate scientists ethically,” she says. But problems are compounded by corporations’ control over the development and deployment of AI.
Epstein says the research community has begun to see the need for a pause. “People are beginning to understand the power of the technology,” she says. Investors in the technology sector are also looking closely at how industry is approaching the ethical use of generative programs, Epstein says. “That is going to have an impact.”
The general public, however, shows no enthusiasm for a contemplative pause in the development of generative AI. “It’s too late,” Epstein says. “Once it gets out to the general public, you lose control.”
And researchers may feel it’s too early for a return to Asilomar to discuss AI. “I don’t think we are at that stage,” UC Berkeley’s Yaghi says. “We are not at the stage where a student . . . pushes a button and the computer gives him a set of conditions to make a new material that is completely different than we’ve known before.”
Yaghi questions what would be accomplished at a science research AI summit. “I don’t have a problem with their setting up an ethical panel, but what are they going to talk about?” he says. The technology is still in an early phase of development, and any panel would have a hard time defining exactly what AI is, according to Yaghi. “These models need to get better and better, but without our participation in it—let’s call it unrestricted participation—they’re not going to get better. We are scientists; we believe in the power of experiment. This is what we do day after day after day. I don’t see this as any different than running an experiment.”