Patient Daily | Mar 22, 2026

AlphaFold Database expands with millions of predicted protein complexes

A collaboration between EMBL's European Bioinformatics Institute, Google DeepMind, NVIDIA, and Seoul National University announced on Mar. 17 the release of millions of AI-predicted protein complex structures through the AlphaFold Database. The new dataset prioritises proteins relevant to human health and disease and is described as the largest collection of its kind currently available.

The expansion aims to help researchers better understand how proteins interact in living organisms, which is essential for uncovering molecular mechanisms behind cell behaviour, disease processes, and drug development. Predicting protein complex structures has been a significant challenge due to the dynamic nature of protein interactions.

"Science thrives on collaboration," said Jo McEntyre, Interim Director of EMBL-EBI. "By making this foundational protein complex dataset openly available to the world, we're inviting researchers to test, refine, and build on it to drive the next wave of biological discoveries."

The update includes millions of homodimers (complexes formed by two identical proteins) and covers 20 widely studied species as well as bacterial pathogens identified by the World Health Organization. Anna Koivuniemi, Head of the Google DeepMind Impact Accelerator, said: "By expanding the AlphaFold Database to include protein complexes, we are addressing a critical need expressed by the scientific community. We hope that by lowering the barrier to these complex predictions, we can empower researchers everywhere to pursue the next wave of discoveries that could ultimately improve human health on a global scale."

Since its launch in 2021 using Google DeepMind's AlphaFold AI system, the database has grown rapidly and now serves over 3.4 million users from 190 countries. The latest effort involved technical innovations from NVIDIA and Seoul National University's Steinegger Lab to accelerate calculations needed for large-scale predictions. Anthony Costa, NVIDIA Director of Digital Biology, said: "NVIDIA's ambition is to consistently contribute orders-of-magnitude accelerations for fundamental digital biology workloads, enabling what was not possible before." Martin Steinegger from Seoul National University added: "By making predicted protein complexes accessible at an unprecedented scale, we are illuminating an unseen landscape of molecular interactions across the tree of life."

Generating the predictions required advanced AI infrastructure: recreating the dataset independently would take about 17 million hours of GPU computing time. By centralising these resources in one open-access database, the project lets scientists worldwide study protein interactions more easily and potentially accelerate breakthroughs in medicine and biology.

Dame Janet Thornton, Director Emeritus of EMBL-EBI, concluded: "The human genome has just over 20,000 different proteins... Adding predicted protein-protein homodimeric interactions to the AlphaFold Database is a first step towards a comprehensive description of the human interactome... Making these structures accessible to all allows every researcher around the world to build on these data, moving one step closer to predicting the biology of life."
