
New AI tool helps match enzymes to substrates

by University of Illinois

24 October 2025


EZSpecificity combines extensive new enzyme-substrate docking data and a new machine learning algorithm to predict the best pairing for making a desired product, with up to 91.7% accuracy. Illinois professor Huimin Zhao led the study. Photo by Fred Zwicky.

By Liz Ahlberg Touchstone

A new artificial intelligence-powered tool can help researchers determine how well an enzyme fits with a desired target, helping them find the best enzyme and substrate combination for applications from catalysis to medicine to manufacturing.

Led by Huimin Zhao, a professor of chemical and biomolecular engineering at the University of Illinois Urbana-Champaign, the researchers developed EZSpecificity using new enzyme-substrate pair data and a new machine learning algorithm. They have made the tool freely available online and published their results in the journal Nature.

“If we want a certain product using an enzyme, we want to use the best enzyme and substrate combination,” said Zhao, who also is the director of the NSF Molecule Maker Lab Institute and of the NSF iBioFoundry at the University of Illinois. “EZSpecificity is an AI model that can analyze an enzyme sequence and then predict which substrate best can fit into that enzyme. It is highly complementary to the CLEAN AI model that we developed to predict an enzyme’s function from its sequence more than two years ago.”

Enzymes are large proteins that catalyze molecular reactions. They have pocket-like regions that target molecules, called substrates, fit into. How well an enzyme and substrate fit is called specificity. The typical analogy for enzyme-substrate interaction is a lock and key: Only the right key will open the lock. However, enzyme function is not that simple, Zhao said.

“It is challenging to figure out the best combination because the pocket is not static,” he said. “The enzyme actually changes conformation when it interacts with the substrate. It is more of an induced fit. And some enzymes are promiscuous and can catalyze different types of reactions. That makes it very hard to predict. That’s why we need a machine learning model and experimental data that really prove which pairing will work best.”

While other enzyme specificity models have been introduced, they are limited in accuracy and in the types of enzymatic reactions they can predict.

Zhao’s group realized that to improve AI’s ability to predict specificity, they needed to improve and expand the dataset that the machine learning model drew from. They partnered with the group led by Diwakar Shukla, a University of Illinois professor of chemical and biomolecular engineering. Shukla’s group performed docking studies for different classes of enzymes to create a large database containing information about not only an enzyme’s sequence and structure, but also how enzymes of various classes conform around different types of substrates.

“Experiments that capture how enzymes interact with their substrates are often slow and complex, so we ran extensive docking simulations to complement and expand on the existing experimental data,” Shukla said. “We zoomed in on the atomic-level interactions between enzymes and their substrates. Millions of docking calculations provided us this missing piece of the puzzle to build a highly accurate enzyme specificity predictor.”

The researchers then tested EZSpecificity side-by-side with ESP, the current leading model, in four scenarios designed to mimic real-world applications. EZSpecificity outperformed ESP in all scenarios. Finally, the researchers experimentally validated EZSpecificity by looking at eight halogenase enzymes, a class that has not been well characterized but is increasingly used to make bioactive molecules, and 78 substrates. EZSpecificity achieved 91.7% accuracy for its top pairing predictions, while ESP only displayed 58.3% accuracy.
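To make the accuracy figures concrete, here is a minimal Python sketch of how accuracy over top pairing predictions could be scored. It assumes that "accuracy" means the fraction of enzymes whose highest-ranked substrate was confirmed experimentally; the article does not spell out the exact metric. The function name, the enzyme and substrate labels, the scores, and the confirmed pairs are all hypothetical toy values, not data or code from the study.

```python
from typing import Dict, Set, Tuple


def top_pairing_accuracy(
    scores: Dict[str, Dict[str, float]],
    confirmed: Set[Tuple[str, str]],
) -> float:
    """Return the fraction of enzymes whose top-scoring substrate is confirmed.

    scores maps each enzyme to {substrate: predicted score};
    confirmed holds (enzyme, substrate) pairs validated in the lab.
    Both are hypothetical inputs used only for illustration.
    """
    if not scores:
        raise ValueError("scores must contain at least one enzyme")
    hits = 0
    for enzyme, substrate_scores in scores.items():
        # Pick the substrate the model ranks highest for this enzyme.
        best = max(substrate_scores, key=substrate_scores.get)
        if (enzyme, best) in confirmed:
            hits += 1
    return hits / len(scores)


# Toy example (invented values): three enzymes, two candidate substrates each.
toy_scores = {
    "enzyme_A": {"sub_1": 0.9, "sub_2": 0.4},
    "enzyme_B": {"sub_1": 0.2, "sub_2": 0.7},
    "enzyme_C": {"sub_1": 0.6, "sub_2": 0.5},
}
toy_confirmed = {
    ("enzyme_A", "sub_1"),
    ("enzyme_B", "sub_2"),
    ("enzyme_C", "sub_2"),
}

accuracy = top_pairing_accuracy(toy_scores, toy_confirmed)
print(f"Top-pairing accuracy: {accuracy:.1%}")
```

In this toy case the top pick is confirmed for two of the three enzymes, so the score is two thirds. Comparing two models under this assumed metric would mean running the same function on each model's scores against the same set of confirmed pairs.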

“I cannot say it works for every enzyme, but for certain enzymes, we showed that EZSpecificity works very well indeed,” Zhao said. “We want to make this tool available to others, so we developed a user interface. Researchers now can enter the substrate and the protein sequence, and then they can use our tool to predict whether that substrate can work well or not.”

Next, the researchers plan to expand their AI tools to analyze enzyme selectivity, which indicates whether an enzyme has a preference for a certain site on a substrate, to help rule out enzymes with off-target effects. They also plan to continue to refine EZSpecificity with more experimental data.
