Nehaveigur

Lab-in-the-Loop: The self-driving research laboratories speeding up drug discovery 

Artificial intelligence isn’t going to solve drug discovery, at least not in the way some people imagine. Even a hyperintelligent AI with access to all the world’s data couldn’t deliver a new, first-in-class, clinically validated drug. As I’ve written before, the reason is that most hard problems in drug discovery, and in biology more generally, are computationally irreducible. This means that computation can’t replace real-world experiments as the ground truth. 

AI in drug discovery will therefore be limited without new data, but it can potentially help to generate new biological data. Imagine an agentic AI that has access to a large, automated laboratory that’s mostly run by robotics. The AI proposes hypotheses, designs experiments to test them, instructs the lab robots to execute them, collects and interprets the results, and uses them to revise the hypotheses. This iterative approach is called lab-in-the-loop, or LITL. 
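The cycle is easiest to see in code. Below is a deliberately simple sketch in which the "biology" is a single hidden parameter (an EC50) and the "lab" is a noisy assay; every name is a hypothetical placeholder, not a real robotics API:

```python
import random

# Toy lab-in-the-loop cycle. The "lab" is a hidden dose-response curve;
# the "AI" repeatedly designs the most informative next experiment,
# "runs" it, and narrows its hypothesis. Every name here is a made-up
# stand-in for real hypothesis generation, robotics, and analysis.

random.seed(0)
TRUE_EC50 = 3.7  # hidden ground truth the loop tries to recover

def run_experiment(dose):
    """The robotic lab: measure the response at a dose, with noise."""
    return dose / (dose + TRUE_EC50) + random.gauss(0, 0.005)

def lab_in_the_loop(n_rounds=20):
    lo, hi = 0.0, 10.0  # hypothesis: the EC50 lies in [lo, hi]
    for _ in range(n_rounds):
        dose = (lo + hi) / 2             # design: probe the midpoint
        response = run_experiment(dose)  # execute on the "robots"
        if response > 0.5:               # interpret: EC50 < dose
            hi = dose                    # revise the hypothesis
        else:
            lo = dose
    return (lo + hi) / 2

estimate = lab_in_the_loop()
```

Each pass through the loop is one design-execute-interpret-revise iteration; note that the experiment, not the computation, supplies the ground truth.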

The speed of iteration is a key determinant of success. That has been true of the drug-discovery and software-development projects I have experience with, and it holds in other contexts, such as warfare.

Conceptually, LITL isn’t that different from the way science is done today: Iterative experiments and hypothesis revision have been part of the scientific method since the beginning. What’s special about AI designing and executing experiments is that it will potentially increase the speed of iteration, and therefore the speed of drug discovery. 

Besides speed, there are other aspects of science in which AI may have an advantage over humans. Even the most high-throughput system can’t run all possible experiments, so it’s important to run those with the highest expected payoff. Those are often the experiments with the greatest potential to reduce uncertainty. By building a digital representation of the biological system, AIs may be able to identify those knowledge gaps faster and more reliably than we can. This is different from old-school machine learning, where the AI was asked to predict the most likely hit. A LITL AI is instead expected to choose the next experiment that will ultimately maximize hit rates. LITL AIs are conceptually closer to reinforcement learning and generative AI.
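One standard way to formalize "pick the experiment that reduces uncertainty the most" is ensemble disagreement, a common active-learning heuristic. The sketch below is a generic illustration under toy assumptions (a made-up assay, a nearest-neighbor model), not any particular company's system:

```python
import random
import statistics

# Toy active learning: an ensemble of bootstrapped models "votes" on
# each candidate experiment, and the next experiment is the one the
# models disagree on most (highest variance) — not the one predicted
# to be the best hit. The "assay" and model are made-up stand-ins.

random.seed(1)

def assay(x):
    """Hidden ground truth the models are trying to learn."""
    return (x - 4) ** 2 / 20 + random.gauss(0, 0.02)

def nearest_prediction(x, points):
    """Predict by copying the nearest labeled measurement."""
    return min(points, key=lambda p: abs(p[0] - x))[1]

candidates = [i / 2 for i in range(21)]        # doses 0.0, 0.5, ..., 10.0
labeled = [(0.0, assay(0.0)), (10.0, assay(10.0))]

for _ in range(5):
    # Ensemble = 25 bootstrap resamples of the labeled data
    ensemble = [[random.choice(labeled) for _ in labeled] for _ in range(25)]
    unlabeled = [x for x in candidates if x not in {p[0] for p in labeled}]
    # Choose the candidate where the ensemble's predictions vary most
    x_next = max(
        unlabeled,
        key=lambda x: statistics.pvariance(
            [nearest_prediction(x, boot) for boot in ensemble]
        ),
    )
    labeled.append((x_next, assay(x_next)))    # "run" the experiment
```

Because disagreement is highest far from existing measurements, the loop naturally fills in the regions of dose space it knows least about, which is the behavior a LITL AI is expected to exhibit.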

Well-defined optimization problems are particularly amenable to a closed-loop approach. My own field of genomics, with its long history of large-scale data generation, is a good match. Changing protein levels with antisense oligos and predicting the effect of missense and other coding variants are potential applications that already make a difference in drug discovery, and for those problems we already know how to design high-throughput experiments. How long it will take to extend the LITL paradigm to more complex, less scalable biological questions remains to be seen. More complexity will require more agentic AI that can combine heterogeneous experimental designs. We’re already seeing such agentic AIs in software development and data analysis, and it’s only a matter of time until they appear in drug discovery and experimental biology. Industry giants like Genentech, newer companies like Recursion and Insitro, and more recent startups like Lila Sciences are pouring hundreds of millions of dollars into building LITL systems. 

Even so, I remain skeptical that this will lead to more than incremental improvements in our rate of drug discovery. The biggest bottlenecks in drug discovery today are related to clinical trials, and those are unlikely to be resolved by LITL or other AI approaches in the next ten years.