
Learning together: AI-driven drug discovery needs a federated approach

March 31, 2026
Abstract

Produced by: Nature Custom Media and Revvity Signals

AI and machine learning models have the potential to transform drug discovery. Improved access to models and pooled training data are needed to translate this promise into tangible results.

The emerging opportunity to accelerate drug discovery with AI and machine learning (AI/ML) is clear. “Wherever we look in the drug development pipeline, we find AI or the desire to use it,” says David Gosalvez, chief strategy officer at the scientific software company Revvity Signals. But using these tools effectively isn’t as simple as flipping an on switch.

After more than 30 years in scientific software development, Gosalvez has a talent for recognizing the operational pain points that can come with new technology. “Everybody wants to capitalize on AI and machine learning, but the reality is that many organizations are not ready to make that transition.”

The challenge lies in logistics. Effective AI/ML adoption requires infrastructure, technical knowledge, and a broad, diverse base of high-quality training data. If labs don’t have these, models are often adopted enthusiastically but not properly integrated.

Smart software use could help to address this bottleneck, making AI/ML models accessible to more users and, through federated learning, improving the models themselves by drawing from a deeper pool of experimental training data.

From SaaS to MaaS

The challenges drug developers face when integrating AI/ML models are not mysterious. They’re the mundane struggles that come with computational infrastructure — acquiring servers, updating software and shepherding data files from one program to the next. Even when built, the bespoke nature of the infrastructure often makes remote collaboration difficult. Collectively, these challenges create a drag on research productivity.

Managing technical infrastructure isn’t new to Gosalvez and the Revvity Signals team. Over the past decade, they’ve built a Software-as-a-Service (SaaS) platform for exactly this purpose. SaaS is a common way to provide tools that would otherwise be out of reach for most laboratories. SaaS providers like Revvity Signals host and manage applications for customers on cloud infrastructure, making a suite of tools accessible through the web.

Gosalvez and his colleagues reasoned that a SaaS-like approach could help improve access to AI/ML models too, leading them to develop Signals Xynthetica.

Signals Xynthetica is designed to integrate AI/ML models into Revvity Signals’ existing SaaS ecosystem. “It’s a system wherein experimental data can be captured and easily funneled into the modelling space,” Gosalvez explains. Because the programs are tightly integrated, data such as sample names and experimental results can move seamlessly between tools, helping to narrow the gap between computational prediction and decision-making at the lab bench.

“When researchers are doing their experimental planning, collaborating with external partners, making decisions and executing experiments, they're already using Revvity Signals software,” says Gosalvez. “Signals Xynthetica just helps them access the predictive power of AI/ML while doing it.”

The platform allows users to either upload their own models or leverage what Revvity Signals and its partners have already built and trained. This removes friction for researchers hoping to use AI/ML models. “IT professionals like this because Signals Xynthetica handles all the data security, data privacy, the model infrastructure, training, execution and tuning,” says Gosalvez. “That's why we call it ‘Models-as-a-Service’.”

The benefits of the Models-as-a-Service (MaaS) approach go beyond infrastructure — as well as simplifying access to AI/ML models, the platform also makes it easier to improve them.

Federated learning

A persistent barrier in the development and use of AI/ML for drug discovery is the need for training data at scale. A single drug developer’s internal data capture only their own proprietary slice of the science, meaning models trained on them alone are often too narrow to generalize as modalities, chemistry and targets evolve.

“If you're trying to build a model that can generalize to a broader chemical space based solely on your own data, you’ll feel like a little spaceship attempting to explore a giant universe,” says Aliza Apple, global head of Eli Lilly and Company’s TuneLab. “There's a huge opportunity in thinking about what we can do together to improve our collective ability to extrapolate.”

Lilly TuneLab is a collaborative platform that enables laboratories across pharma and biotech to work together on training AI/ML models [1]. The platform is designed around the concept of federated learning, a method for collective learning without ever sharing data externally. A model can access data at a local node, such as an individual biotech’s cloud server, for training and send insights up to an aggregator. The aggregator collects information from multiple nodes and uses it to update the global model for collective use.
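The node-and-aggregator loop described above can be illustrated with a toy example. The sketch below is not TuneLab's implementation, which the article does not detail; it assumes the common federated averaging scheme, in which each node trains on its own data and the aggregator takes a sample-weighted average of the resulting parameters. The names `LocalNode`, `aggregate` and `run_rounds` are hypothetical, and a two-parameter linear model stands in for a real predictive model.

```python
from dataclasses import dataclass
from typing import List, Tuple

Weights = List[float]  # [slope, intercept] of a toy linear model


@dataclass
class LocalNode:
    """Hypothetical participant; its raw data stays inside this object."""
    xs: List[float]
    ys: List[float]

    def train(self, global_w: Weights, lr: float = 0.05,
              steps: int = 20) -> Tuple[Weights, int]:
        """Start from the global model and take gradient steps on local data."""
        slope, intercept = global_w
        n = len(self.xs)
        for _ in range(steps):
            # Gradient of the mean squared error, computed on local data only.
            errors = [slope * x + intercept - y
                      for x, y in zip(self.xs, self.ys)]
            g_slope = sum(2 * e * x for e, x in zip(errors, self.xs)) / n
            g_intercept = sum(2 * e for e in errors) / n
            slope -= lr * g_slope
            intercept -= lr * g_intercept
        # Only the updated parameters and a sample count leave the node.
        return [slope, intercept], n


def aggregate(updates: List[Tuple[Weights, int]]) -> Weights:
    """Sample-weighted average of node parameters (federated averaging)."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


def run_rounds(nodes: List[LocalNode], rounds: int = 50) -> Weights:
    """Repeat: broadcast the global model, train locally, aggregate."""
    global_w: Weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [node.train(global_w) for node in nodes]
        global_w = aggregate(updates)
    return global_w


if __name__ == "__main__":
    # Two nodes observe different ranges of the same relation y = 2x + 1.
    node_a = LocalNode(xs=[0.0, 0.5, 1.0], ys=[1.0, 2.0, 3.0])
    node_b = LocalNode(xs=[1.5, 2.0], ys=[4.0, 5.0])
    print(run_rounds([node_a, node_b]))
```

In this sketch the aggregator only ever sees parameter vectors and sample counts, never the `xs` and `ys` held by each node. Production systems typically layer further protections on top of this basic exchange, which are outside the scope of the example.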

This approach gives participating biotech companies access to continually improving models that would otherwise be out of reach.

“The only way we can make significant advances as an industry is to pool our data,” says Constantine Kreatsoulas, senior vice president at Circle Pharma, a developer of macrocycle-based cancer therapies. “Until recently, that’s been difficult because data in this space is closely guarded.”

Data privacy and IP protection are longstanding barriers to scientific collaboration, but federated learning helps address them, Apple says, because “the local data that feeds TuneLab never leaves the local environment and never co-mingles with any of the other nodes,” thus preserving the sanctity of proprietary data.

At its core, Lilly TuneLab provides access to select Lilly AI/ML models trained on proprietary program data from multiple decades of drug development. In exchange, participating labs contribute data that further refines the shared models in an industry-wide closed loop. For Circle Pharma, one of the first biotechs to join TuneLab, access to such diverse training data is compelling.

“We may benefit from one another,” Kreatsoulas says, “because even if there is no additional data from others on Circle’s modality, macrocycles, there may be features from other moieties — such as PROTACs or peptides — that help improve the model for macrocycles and vice versa.”

A compelling combination

Already, more than 70 companies are using and contributing to Lilly TuneLab, and that number is likely to grow thanks to its integration into R&D platforms like Signals Xynthetica.

Signals Xynthetica provides direct access to TuneLab and makes it simple for researchers to contribute data back into the closed-loop ecosystem. “This collaboration with Revvity Signals is really about figuring out how to meet the scientists where they're already working and make the tools available to them there,” Apple says.

“Ultimately, these models are precompetitive, so it benefits us all to improve them together,” adds Gosalvez. “Signals Xynthetica and Lilly TuneLab make that possible.”

MaaS and federated learning could make for a potent combination — one that deeply embeds AI/ML models within daily research workflows while developing closed-loop learning systems that refine predictions against a wide pool of emerging experimental data. Kreatsoulas is optimistic about what might be achieved with a federated approach.

“It’s helping us rethink how we do science together.”


References

1. Marshall, A. Nature Biotechnol. 43, 1743–1746 (2025).
