Meta AI releases models of over 600 million potential proteins

This week, scientists at Meta, the firm behind Facebook and Instagram, released the structures of more than 600 million putative proteins in a database called the ESM Metagenomic Atlas. The structures are for proteins predicted to exist based on genetic data from large-scale metagenomic screens of soil, seawater, and other sources. The proteins themselves have yet to be isolated or identified using proteomic methods.

Credit: Meta AI

The structure of the plastic-degrading enzyme PETase, as predicted by the ESMFold algorithm. The ribbon is colored to show the algorithm's confidence at each amino acid position, with dark and light blue indicating higher confidence and orange and yellow indicating lower confidence.

The team describes the method used to perform this feat in a preprint (bioRxiv 2022, DOI: 10.1101/2022.07.20.500902), which has yet to undergo peer review.

In July, the Alphabet-owned company DeepMind announced that it had filled a database with predicted structures for almost all known proteins. That database holds around 200 million models made using AlphaFold, DeepMind’s algorithm for predicting protein structures. ESMFold, the Meta AI algorithm used to make the new protein models, is not as accurate as AlphaFold, but it is quicker, researchers say. The speed comes from how the tool works: it predicts protein structures with a language model trained on sequence data, that is, the order of amino acids in the linear chain that makes up a protein. The increased speed meant that the researchers could predict the 600 million structures in just 2 weeks, using a cluster of approximately 2,000 graphics processing units.
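To put those figures in perspective, here is a rough back-of-envelope calculation. It is only a sketch: it treats the quoted numbers as exact (600 million structures, 14 days, 2,000 GPUs) and assumes every GPU was busy the whole time with the work split evenly, none of which the article states. The function names are illustrative.

```python
# Back-of-envelope throughput from the figures quoted in the article.
# Assumptions (not from the source): exactly 14 days, exactly 2,000 GPUs,
# and all GPUs busy for the full period with work split evenly.

STRUCTURES = 600_000_000
GPUS = 2_000
DAYS = 14
SECONDS_PER_DAY = 24 * 60 * 60


def cluster_rate(structures: int, days: int) -> float:
    """Structures predicted per second across the whole cluster."""
    return structures / (days * SECONDS_PER_DAY)


def gpu_seconds_per_structure(structures: int, gpus: int, days: int) -> float:
    """Average GPU-seconds spent on one structure."""
    return gpus * days * SECONDS_PER_DAY / structures


if __name__ == "__main__":
    print(f"{cluster_rate(STRUCTURES, DAYS):.0f} structures per second across the cluster")
    print(f"{gpu_seconds_per_structure(STRUCTURES, GPUS, DAYS):.1f} GPU-seconds per structure")
```

Under these assumptions the cluster would average roughly 500 structures per second, or about 4 GPU-seconds per structure. The real per-structure time would vary with protein length and hardware.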

The Meta AI researchers have also published the code that they used to create the new database. They intend for other scientists to use the tool for their own research.

Pernilla Wittung-Stafshede, a protein folding expert at Chalmers University of Technology, says the new database “gives a really broad view of [the] protein universe on Earth.” But she cautions that structure prediction algorithms are just the beginning, with more work needed to tease out each protein’s function, which she says is the next challenge.

Chemical & Engineering News

ISSN 0009-2347


AI lab from tech company Meta joins the protein structure prediction game and creates models based on metagenomic data

by Laura Howes

November 3, 2022
