Artificial intelligence has enabled researchers to trace the evolution of genetic control elements in the developing mammalian cerebellum. An international team, led by biologists from Heidelberg University in Germany and from the Vlaams Instituut voor Biotechnologie and KU Leuven in Belgium, developed advanced AI models that predict the activity of these elements from their DNA sequence alone.
Genetic control elements are DNA sequences that determine when and where genes are activated. Changes in their activity can lead to evolutionary developments such as brain expansion. The cerebellum, a brain region involved not only in movement and balance but also cognition, emotion, and language, has notably expanded during human evolution.
To address gaps in understanding how these elements evolved, the research team applied AI-based analysis tools to complex biological datasets. "Customized tools for the AI-based analysis of comprehensive and complex datasets in the life sciences have allowed us to decode the sequence grammar and hence the genetically coded activity profiles of these control elements," said Prof. Dr Stein Aerts, computational biologist at Vlaams Instituut voor Biotechnologie and KU Leuven.
The scientists used modern sequencing technologies to map element activity in individual cells from developing cerebella across six species: human, bonobo, macaque, marmoset, mouse, and opossum. They trained machine learning models on this data to predict control element activity directly from DNA sequences. These models accurately predicted activity both within these species and across other mammals. "This goes to show that the sequence rules that define genetic control elements in cerebellar cell types have been highly conserved throughout mammalian evolution," explained Dr Ioannis Sarropoulos, co-first author of a paper on this research.
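As a rough illustration of what predicting activity "directly from DNA sequences" involves, sequence-based models commonly receive DNA as a one-hot matrix with one row per base. The article does not describe the team's actual encoding or model architecture, so the function name and the handling of unknown bases below are assumptions made for this sketch.

```python
# Illustrative sketch only: the article does not specify the team's input encoding.
BASES = "ACGT"


def one_hot_encode(sequence):
    """Return one 4-element row per base, with columns ordered A, C, G, T.

    Bases outside A/C/G/T (for example N) become an all-zero row.
    This convention is an assumption, not taken from the study.
    """
    rows = []
    for base in sequence.upper():
        rows.append([1.0 if base == b else 0.0 for b in BASES])
    return rows


# Hypothetical five-base input, chosen only to show the output shape.
encoded = one_hot_encode("ACGTN")
```

A model trained on such matrices sees only the sequence itself, which is why the same trained model can in principle be applied to the genome of a species it was never trained on.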
Using their AI models' ability to identify conserved sequence patterns, researchers predicted control element activity across 240 mammalian species. For each human element studied, they determined whether corresponding sequences were active in other mammals. This approach allowed them to reconstruct evolutionary histories of regulatory programs with high resolution and pinpoint those contributing to key innovations in humans.
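The per-element comparison described above can be pictured as a simple lookup: for one human control element, take the predicted activity of the corresponding sequence in each species and record where it counts as active. The sketch below assumes a hypothetical dictionary of scores and an arbitrary cutoff of 0.5. The article reports neither the scores nor any threshold, and the species values shown are placeholders, not results from the study.

```python
# Illustrative sketch only: scores, species subset, and threshold are hypothetical.
def active_species(predicted_scores, threshold=0.5):
    """Return the sorted names of species whose predicted activity meets the threshold."""
    return sorted(
        species
        for species, score in predicted_scores.items()
        if score >= threshold
    )


def conservation_fraction(predicted_scores, threshold=0.5):
    """Return the fraction of species in which the element is predicted to be active."""
    if not predicted_scores:
        return 0.0
    return len(active_species(predicted_scores, threshold)) / len(predicted_scores)


# Placeholder values for a single element; not data from the study.
example_scores = {"human": 0.9, "bonobo": 0.8, "mouse": 0.2, "opossum": 0.1}
print(active_species(example_scores))
print(conservation_fraction(example_scores))
```

Repeating such a summary for every element and placing the results on the mammalian family tree is one plausible way to see when a regulatory sequence became active, although the team's actual reconstruction method is not detailed in the article.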
One finding was a newly identified control element near THRB, a gene that encodes a thyroid hormone receptor and is present in all vertebrates. Because of this new regulatory sequence, the gene is now active in cerebellar stem cells. "That an evolutionarily ancient gene can be repurposed for novel functions is a key mechanism by which evolution drives innovation," said Prof. Dr Henrik Kaessmann of Heidelberg University.
Researchers from Göttingen and Leipzig as well as Hungary and the United Kingdom also contributed. Funding came from organizations including the European Research Council, European Molecular Biology Organization, and Simons Foundation. The study results appear in Science.