Researchers announced on Apr. 2 that machine learning models have led to the discovery of thousands of previously unknown bacterial immune defense systems. The two studies, one by Peter DeWeirdt and colleagues and the other by Ernest Mordret and colleagues, used computational tools to analyze large numbers of bacterial protein sequences.
The findings are significant because bacterial immune systems protect bacteria against viruses called phages, and their precision has inspired technologies such as CRISPR gene editing. By uncovering more of these systems, scientists hope to expand the range of tools available for biotechnology.
DeWeirdt's team developed a model called DefensePredictor, which uses genetic information from a protein and its neighboring genes to predict whether the protein is involved in immune defense. Applied to 69 strains of E. coli, the tool predicted hundreds of new immune systems, and experiments confirmed defensive function in 42 cases. A broader analysis across 1,000 bacterial genomes revealed nearly 3,000 distinct protein clusters not previously linked to known immunity mechanisms. DefensePredictor is now available as open-source software for other researchers.
In a separate effort, Mordret's group created additional machine learning models capable of predicting antiphage defense systems at scale. Applied to more than 120 million proteins, these models identified hundreds of thousands of candidate antiphage families.
Together, these studies show that bacterial immunity is much more widespread than previously understood and suggest that future research may uncover even more ways bacteria defend themselves against viruses.