You might not guess it from a trip to the aquarium, but the natural glow of certain jellyfish and other sea life has become key to biology’s understanding of ourselves, from the spread of cancer cells to the inner workings of the brain. Biologists use the green fluorescent protein (GFP) derived from these creatures as a visual marker to get a better look at all sorts of biological phenomena.
A new startup from a group of ex-Meta scientists claims it can use generative AI to replace the role of the jellyfish in this scenario. EvolutionaryScale says it has used generative AI to create a bespoke GFP that departs from existing luminescent proteins made in previous lab processes.
EvolutionaryScale Chief Scientist Alex Rives said the discovery is an example of how the company’s latest family of biological foundation models, ESM3, can synthesize novel proteins based on a prompt. That ability could eventually have implications for drug discovery, sustainability fixes, and beyond, Rives said.
“Most of the diversity in those proteins has come from discovering new ones in the natural world. So a new fluorescent protein in a different species of coral or a jellyfish or some other animal,” Rives told Tech Brew. “Known fluorescent proteins have taken 500 million years to diverge. So you can think about the model as simulating 500 million years of evolution to create a new protein.”
Protein power: EvolutionaryScale emerged from stealth in late June backed by a seed round of $142 million raised from investors including Amazon and Nvidia’s investment arm, NVentures, according to Crunchbase. It’s one of a growing crop of biotech startups using a version of large language models to synthesize new proteins, mostly in an effort to supercharge drug discovery.
True to the startup’s name, Rives said the model’s enormous scale is one thing that sets the company apart in this burgeoning space. ESM3 was trained on a dataset of 2.78 billion natural proteins, with what the company claims is “more compute than any other known model in biology,” and is made up of 98 billion parameters, the training-derived values that define a model.
Some members of EvolutionaryScale’s team had been working on earlier versions of the model at Meta’s AI research lab before the project was scrapped last year as part of a broader shift toward more commercial applications of AI.
Rives said these models are also the first of the series to allow for prompting based on inputs like “atomic-level details of the protein structure” or keywords related to “overall fold, topology, or functional properties.”
“It has this ability to bridge the gap from the ideas of a protein designer through to that biological complexity of how do you put that together?” Rives said.
In addition to an API in closed beta, EvolutionaryScale, which is registered as a public benefit corporation, has made a smaller open model available to researchers for noncommercial use.
But in an era when experts worry that generative AI could be used to develop bioweapons, the company is taking certain precautions, Rives said.
“We’ve taken a relatively conservative approach with this first launch, which is we basically remove the ability for the model to understand these kinds of proteins [related to viruses],” Rives said. “And we had the model reviewed by a group of scientific experts, and we’ve been in communication with the right people in the US government or around this to inform them of our plan.”
Beyond drugs: While drug design is currently the most viable commercial use of synthetic proteins, according to Rives, he wants to eventually explore using synthetic proteins for environmental applications as well, such as creating proteins that can degrade plastic, capture carbon, and improve agriculture.
“You can go out and prospect the natural world and find proteins that already [degrade plastic]. So tools for protein design could make those proteins more efficient, more stable, more usable in a practical setting,” Rives said. “Our goal is really to just broadly enable the scientific community with tools that can really help them to be creative and think about these problems.”