A new Scripps Research machine-learning system tracks how epidemic viruses evolve. This technology could have predicted the emergence of SARS-CoV-2 “variants of concern” (VOCs) ahead of their official designations by the World Health Organization (WHO). Credit: Graphic made using BioRender.com

Scripps Research scientists develop AI-based tracking and early-warning system for viral pandemics

Machine-learning system effectively predicts emergence of prominent variants.

July 21, 2023

LA JOLLA, CA — Scripps Research scientists have developed a machine-learning system—a type of artificial intelligence (AI) application—that can track the detailed evolution of epidemic viruses and predict the emergence of viral variants with important new properties.

In a paper in Cell Patterns on July 21, 2023, the scientists demonstrated the system by using data on recorded SARS-CoV-2 variants and COVID-19 mortality rates. They showed that the system could have predicted the emergence of new SARS-CoV-2 “variants of concern” (VOCs) ahead of their official designations by the World Health Organization (WHO). Their findings point to the possibility of using such a system in real time to track future viral pandemics.

“There are rules of pandemic virus evolution that we have not understood but can be discovered, and used in an actionable sense by private and public health organizations, through this unprecedented machine-learning approach,” says study senior author William Balch, PhD, professor in the Department of Molecular Medicine at Scripps Research.

The co-first authors of the study were Salvatore Loguercio, PhD, a staff scientist in the Balch lab at the time of the study, and currently a staff scientist at the Scripps Research Translational Institute; and Ben Calverley, PhD, a postdoctoral research associate in the Balch lab.

The Balch lab specializes in the development of computational, often AI-based, methods to illuminate how genetic variations alter the symptoms and spread of diseases. For this study, they applied their approach to the COVID-19 pandemic. They developed machine-learning software, using a strategy called Gaussian process-based spatial covariance, to relate three data sets spanning the course of the pandemic: the genetic sequences of SARS-CoV-2 variants found in infected people worldwide, the frequencies of those variants, and the global mortality rate for COVID-19.
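For readers curious about what Gaussian process modelling involves in general, the following is a minimal sketch of plain Gaussian process regression in Python. It is not the study's software. The two input columns (a normalized genome position and a variant frequency), the made-up output standing in for a mortality rate, the squared-exponential kernel, and the names `rbf_kernel` and `gp_predict` are all assumptions chosen for illustration, and the data are synthetic.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between the rows of a and the rows of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_predict(x_train, y_train, x_test, noise=1e-2, length_scale=1.0):
    """Posterior mean and variance of a zero-mean Gaussian process at x_test."""
    k_tt = rbf_kernel(x_train, x_train, length_scale)
    k_tt = k_tt + noise * np.eye(len(x_train))  # observation noise on the diagonal
    k_st = rbf_kernel(x_test, x_train, length_scale)
    k_ss = rbf_kernel(x_test, x_test, length_scale)

    chol = np.linalg.cholesky(k_tt)  # stable alternative to inverting k_tt
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha

    v = np.linalg.solve(chol, k_st.T)
    var = np.diag(k_ss) - (v**2).sum(axis=0)
    return mean, var


# Synthetic stand-in data (hypothetical, not from the study):
# column 0 = normalized genome position, column 1 = variant frequency.
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=(30, 2))
y_train = np.sin(3.0 * x_train[:, 0]) + 0.5 * x_train[:, 1]  # made-up "mortality" signal

x_test = np.array([[0.5, 0.5], [0.9, 0.1]])
mean, var = gp_predict(x_train, y_train, x_test, length_scale=0.3)
print("predicted values:", mean)
print("predictive variance:", var)
```

The predictive variance is what makes this family of models attractive for surveillance-style uses: it reports how uncertain the model is at each input, so sparsely sampled regions are flagged rather than silently extrapolated.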

“This computational method used data from publicly available repositories,” Loguercio says. “But it can be applied to any genetic mapping resource.”

The software enabled the researchers to track sets of genetic changes appearing in SARS-CoV-2 variants around the world. These changes—typically trending towards increased spread rates and decreased mortality rates—signified the virus’ adaptations to lockdowns, mask wearing, vaccines, increasing natural immunity in the global population, and the relentless competition among SARS-CoV-2 variants themselves.

“We could see key gene variants appearing and becoming more prevalent, as the mortality rate also changed, and all this was happening weeks before the VOCs containing these variants were officially designated by the WHO,” Balch says.

He and his team showed that they could use this SARS-CoV-2 tracking system as an early warning “anomaly detector” for gene variants associated with significant changes in viral spread and mortality rates.

“One of the big lessons of this work is that it is important to take into account not just a few prominent variants, but also the tens of thousands of other undesignated variants, which we call the ‘variant dark matter,’” Balch says.

A similar system could be used to track the detailed evolution of future viral pandemics in real time, the researchers note. In principle, it would enable scientists to predict changes in a pandemic’s trajectory—for example, big increases in infection rates—in time to adopt appropriate public health countermeasures.

Balch and his colleagues also envision the use of their approach to better understand virus biology and thereby enhance the development of treatments and vaccines. Currently they are using their AI system to uncover key details of how different SARS-CoV-2 proteins worked together in the evolution of the pandemic.

“This system and its underlying technical methods have many possible future applications,” Calverley says.

“Understanding the Host-Pathogen Evolutionary Balance through Gaussian Process Modelling of SARS-CoV-2” was co-authored by Salvatore Loguercio, Ben Calverley, Chao Wang, Daniel Shak, Pei Zhao, Shuhong Sun, Scott Budinger, and William Balch.

For more information, contact press@scripps.edu.