How AI will change Pharma R&D

Here’s what I believe about how AI will change R&D at top-20 pharma companies. It draws on what's happening in AI and on more than 500 conversations Elicit has had with most of the top pharma companies and many external advisors.

The Mythos moment for bio is coming

Claude Mythos is a new model that Anthropic decided not to release publicly because it could be used to hack network systems by finding software bugs. It found a 27-year-old bug in OpenBSD, an operating system known for its strong security. This revealed that software is (even) less secure than previously thought. What is the equivalent for bio? Right now, creating harm is difficult. Nature optimizes for transmission, not lethality, so nothing has ever searched the pathogen design space to find the most lethal pathogens.

On SecureBio's Virology Capabilities Test, frontier LLMs already outperformed PhD-level virologists last year at troubleshooting in their own specialty. Experts predict that the annual risk of a human-caused 100k+ epidemic will rise from ~0.3% to ~1.5%, a fivefold increase, given capabilities we can already anticipate. Of course there’s a large gap between "design in silico" and "viable pathogen in the wild", but the bar for doing things like sequencing your genome at home or, more nefariously, fine-tuning a biological AI model on human-infecting viral sequences is getting lower every month.
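To put those annual figures on a longer horizon, here is a rough sketch of how they compound over a decade. It assumes the annual risk stays constant and that years are independent. Both are simplifications of mine, not part of the expert estimates, and the function name is illustrative.

```python
def cumulative_risk(annual_risk: float, years: int) -> float:
    """Probability of at least one event within `years`.

    Assumption (not from the expert estimates): the annual risk is
    constant and each year is independent of the others.
    """
    return 1.0 - (1.0 - annual_risk) ** years


# Annual risks quoted in the text: ~0.3% today, ~1.5% anticipated.
baseline_decade = cumulative_risk(0.003, 10)
elevated_decade = cumulative_risk(0.015, 10)

print(f"10-year risk at 0.3%/yr: {baseline_decade:.1%}")
print(f"10-year risk at 1.5%/yr: {elevated_decade:.1%}")
```

Under those assumptions, a ~0.3% annual risk compounds to roughly 3% over ten years, and a ~1.5% annual risk to roughly 14%.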

Why does it matter? Operation Warp Speed took “only” 210 days from launch to the first EUA against COVID, at a cost of $10B+, because mRNA platforms and spike-protein biology were ready from fifteen years of prior work. If a designed pathogen is released on a Friday night, the response would need to be far faster, and it would instantly become the top priority for the industry. To what extent can we do the equivalent of preparing and patching our systems ahead of time?

The insight overhang

On the positive side, R&D is likely sitting on a huge insight overhang, in the same way that computer security was sitting on a large overhang of undiscovered security issues. Insight overhang means: there is far more that's deducible from what's already known — by combining the company's own data, the academic literature, mechanism databases, clinical trial registries, and real-world evidence — than has been deduced.

This is a commonly held view among Pharma R&D scientists: what's needed is to make connections across different data sources, for example by linking a molecule's selectivity profile to a subgroup-specific venous-thromboembolism signal that surfaced years later in a competitor's long-term safety follow-up. Yet no human has the breadth, time, and cross-domain fluency to do it. AI will bring the bandwidth to trace implications at scale across these sources.

AI drug discovery hasn't had an amazing run so far. Recursion cut its pipeline in 2025; BenevolentAI restructured back to techbio roots. On the other hand, Insilico's rentosertib, with target (TNIK) and molecule both designed by generative AI, was reported to have hit its primary endpoint in a Phase IIa trial last summer, and the sentiment seems to be changing rapidly.

So how will this work? One path that we know many companies are developing is reverse translation. After a Phase II trial fails, you ask: were we right about some of the components of the hypothesis? Did the target actually get engaged? Was the signaling pathway activated? Then you repurpose the findings to improve this or other programs. This will usually be a multi-hop inference that connects two or three findings, possibly living in different parts of the company, where no single human was in a great position to connect them.
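One way to picture such a multi-hop inference is as a path search over a graph of findings, where each edge records which source a link came from. The sketch below is a toy illustration only. The node names, the sources, and the `find_chain` helper are all hypothetical placeholders, not real data or any company's actual system.

```python
from collections import deque

# Hypothetical findings graph: (from_finding, to_finding, source_of_link).
# Every entry is a made-up placeholder for illustration.
EDGES = [
    ("molecule_selectivity_profile", "intermediate_mechanism", "internal assay data"),
    ("intermediate_mechanism", "pathway_effect", "academic literature"),
    ("pathway_effect", "subgroup_safety_signal", "competitor safety follow-up"),
    ("molecule_selectivity_profile", "target_engaged", "Phase II biomarker data"),
]


def find_chain(start, goal, edges, max_hops=3):
    """Breadth-first search for the shortest chain of findings from
    `start` to `goal`, using at most `max_hops` links.

    Returns a list of (from, to, source) hops, or None if no chain exists.
    """
    graph = {}
    for src, dst, source in edges:
        graph.setdefault(src, []).append((dst, source))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_hops:
            continue  # do not extend chains beyond the hop limit
        for nxt, source in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, nxt, source)]))
    return None


chain = find_chain("molecule_selectivity_profile", "subgroup_safety_signal", EDGES)
for src, dst, source in chain or []:
    print(f"{src} -> {dst}  (via {source})")
```

In this toy graph the chain takes three hops, each from a different source, which is the kind of connection no single team would see from its own data alone. Finding a path says nothing about whether each link actually holds.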

But there are at least two reasons we might not realize the overhang, even with strong AI models.

The walls inside

Something we've heard over and over, most recently from a pharma executive with 30 years across R&D, commercial, and portfolio strategy:

Sometimes there are more walls between the functions inside pharma than between different pharma companies. You'd have an easier time going from R&D at one company to R&D at another than crossing functions inside your own. It's like Spanish and French. Both Latin, but they don't really understand each other.

The target-validation scientist doesn't talk to the HEOR team, which doesn't talk to the access team negotiating with German payers, which doesn't talk to the pharmacovigilance group sifting case reports. Often they redo similar work, looking at the same information through slightly different lenses. Some of these walls are important for managing coordination overhead, others for regulatory compliance or IP protection. Commercial and medical teams, for example, are kept at arm's length so commercial incentives don't bias medical judgment, but AI can preserve that separation by passing only the insight-relevant signal across the wall and filtering out the biasing context. Because many insights require connecting the dots, and because AI will be increasingly good at it, the competitive cost of keeping up these data walls will increase.

How can we measure those costs? People like to treat FDA approval as the finish line, but 36% of US drugs approved between 2012 and 2017 missed year-one sales forecasts by 20% or more, and 70% of those never caught up in years two and three. Taken together, that is roughly a quarter of launches (36% × 70% ≈ 25%) that missed and stayed behind. We've heard from multiple sources (including Deloitte, and an advisor with 25 years in pharma market access) that these walls are part of the explanation for this outcome.

The verification problem

The second problem is that not everything that looks like an insight is real, so you have a verification problem. R&D is the first place where this bites, so let's introduce it here. AI reasoning is great in cases with tight feedback loops. Ajeya Cotra:

You should expect AIs to be much better at things that there are tighter feedback loops on, where you can recognise success after a short period of time. That's one of the reasons why they're really good at coding, because you can just train them on this very hard-to-fake signal of: did the code run?

Some of R&D has this property. Find the paper that established this kinase's role in autoimmune signaling. Find the archival trial that tested a close analog. Find the compound class matching these activity constraints.

Most of R&D isn't easy to verify. What is the optimal dose-escalation strategy in patients with impaired hepatic function? Which fifteen programs should we greenlight from eighty candidates? What should the oncology strategy look like in 2030? These questions have better and worse answers, but you can't easily look at an answer and check how good it is. Correctness is about whether you followed a good process in coming up with the answer, the sort of process you'd expect to succeed at identifying the right answer ex ante.

Target validation is definitely hard to verify. A wrong target looks right until the Phase II readout, three to five years and several hundred million dollars downstream. The Claude Mythos system card flags that this model is still weak on "epistemics, calibration, distinguishing correlation and causation, figuring out what strategies will or won't work in practice in domains where it can't easily check."

Prioritization in a target boom

But let's say you solved the issues above to get to a long list of promising targets. I think R&D may soon produce more plausibly-validated hypotheses than clinical development can absorb. Today the number of new targets is tiny, maybe four or five big new target evaluations per year, with two or three turning into major priorities. Putting AI aside, we're already hearing about efforts to find and pursue more targets in parallel and compress the cycle times.

We don't know what the multiplier here will be with AI but it's easy to imagine that it's 10x+ and the bottleneck shifts from early research to the clinical stack. This means that portfolio committees will have to decide between a much larger number of promising targets, so (1) doing this well becomes a competitive edge and (2) this is itself the sort of decision that is hard to check. Those who figure out how to leverage AI for it will fare best.
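To make the bottleneck shift concrete, here is a back-of-the-envelope sketch. It takes the rough figures above (about five target evaluations per year, about three becoming major priorities) and assumes, purely for illustration, that clinical capacity stays fixed while AI multiplies the number of candidates. The function and parameter names are mine.

```python
def selection_pressure(candidates_per_year, multiplier, clinical_slots_per_year):
    """Return (candidates, advanced, fraction_advanced) for one year.

    Assumption for illustration: clinical capacity is a fixed number of
    slots per year and does not grow with the number of candidates.
    """
    candidates = candidates_per_year * multiplier
    advanced = min(candidates, clinical_slots_per_year)
    fraction = advanced / candidates if candidates > 0 else 0.0
    return candidates, advanced, fraction


# Rough figures from the text: ~5 evaluations, ~3 major priorities per year.
today = selection_pressure(5, 1, 3)
with_ai = selection_pressure(5, 10, 3)  # the "10x+" scenario

for label, (candidates, advanced, fraction) in [("today", today), ("10x", with_ai)]:
    print(f"{label}: {candidates} candidates, {advanced} advanced ({fraction:.0%})")
```

With these illustrative numbers, the share of candidates that can advance falls from about 60% to about 6%. The portfolio committee goes from light pruning to rejecting the large majority of plausible targets.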

Strategy in a time of change

I remain confused about what will happen. Relative to the AI capabilities I expect to see in the next 5-10 years (models that are vastly superhuman for effectively all tasks, autonomous robots running factories), a lot of the thinking above seems tame. If things change as rapidly as I expect, the amount of adaptation needed will be much greater. Being able to quickly integrate new technologies, respond to the changing environment, and leverage AI to make better strategic decisions will become the most important factor.

For better and worse, we've heard that senior management in pharma loves AI - the more senior the more enthusiastic - but not everyone (they say) is great at checking whether the AI advice holds up, even when it’s being used to inform high-stakes decisions.

So, if I had to compress this whole memo into a single bet: cross-cutting verification infrastructure is the highest-leverage AI investment in pharma. If you're at a top-20 pharma and this sounds relevant, reach out to be part of our early access program.