When faced with a mountain of gene expression data, any reduction in the computational cost of analyzing a cell’s RNA transcripts is invaluable.
Researchers at the Bader Lab in the Donnelly Centre have developed an algorithm they call FLASH-MM, which allows the analysis of individual cell transcriptomes to take minutes rather than days. In one test, the team analyzed a real biological dataset of 500,000 T-cells in under two hours—nearly fifty-four hours faster than standard methods—without sacrificing accuracy.
“In practice, this turns analyses that used to take days into something you can run in a single afternoon,” says Delaram Pouyabahar, second author of the Nature Communications paper and PhD graduate from the Bader Lab. “FLASH-MM can be a game-changer in single-cell genomics, as it enables the use of powerful statistical models at any practical number of cells.”
FLASH-MM is an estimation algorithm designed for differential expression (DE) analysis, a method that identifies changes in gene expression across experimental conditions in large RNA sequencing datasets. Analyzing these datasets, which comprise data from hundreds of thousands to millions of cells, is like analyzing the speech patterns of a city’s population from hundreds of hours of recordings: there is high variance between subjects and high correlation within any single subject’s data.
That wealth of information grows with every advance in sequencing technology and cost efficiency. Researchers currently address these difficulties with a statistical framework known as a linear mixed-effects model (LMM), but it has one primary drawback: it is computationally demanding, requiring large amounts of memory and lengthy runtimes.
FLASH-MM fixes these issues without sacrificing accuracy, explains first author and senior research associate Changjiang Xu.
“We leveraged precomputed aggregate representations that captured essential information from the data without storing measurements for each individual cell,” says Xu. “By transferring the computation of high-dimension matrices to a lower dimension in the model estimation step, FLASH-MM achieves both computational efficiency and significantly lower memory usage.”
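To make the idea of aggregate representations concrete, here is a minimal Python sketch. It is not FLASH-MM's actual implementation: it assumes a single random intercept per donor, holds the variance ratio `lam` fixed (a real fit would estimate the variance components), and uses simulated data with made-up sizes and effects. The function names `summarize` and `solve_from_summaries` are hypothetical. It shows that the fixed-effect estimates of a linear mixed model can be recovered from small cross-product matrices whose size depends on the number of covariates and donors, not on the number of cells.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated toy data; every size and effect value here is an assumption.
n_cells, n_donors = 1500, 8
donor = rng.integers(0, n_donors, size=n_cells)
X = np.column_stack([np.ones(n_cells), rng.integers(0, 2, size=n_cells)])  # intercept + condition
Z = np.eye(n_donors)[donor]  # one-hot donor membership (random-effect design)
y = X @ np.array([1.0, 0.5]) + Z @ rng.normal(0, 1.0, n_donors) + rng.normal(0, 1.0, n_cells)


def summarize(X, Z, y, chunk=500):
    """Accumulate cross-products chunk by chunk; no result has a cell dimension."""
    p, q = X.shape[1], Z.shape[1]
    s = {"XX": np.zeros((p, p)), "XZ": np.zeros((p, q)), "ZZ": np.zeros((q, q)),
         "Xy": np.zeros(p), "Zy": np.zeros(q)}
    for i in range(0, len(y), chunk):
        Xc, Zc, yc = X[i:i + chunk], Z[i:i + chunk], y[i:i + chunk]
        s["XX"] += Xc.T @ Xc
        s["XZ"] += Xc.T @ Zc
        s["ZZ"] += Zc.T @ Zc
        s["Xy"] += Xc.T @ yc
        s["Zy"] += Zc.T @ yc
    return s


def solve_from_summaries(s, lam):
    """Solve Henderson's mixed-model equations for a fixed ratio
    lam = (residual variance) / (random-effect variance)."""
    p, q = s["XX"].shape[0], s["ZZ"].shape[0]
    lhs = np.block([[s["XX"], s["XZ"]],
                    [s["XZ"].T, s["ZZ"] + lam * np.eye(q)]])
    rhs = np.concatenate([s["Xy"], s["Zy"]])
    sol = np.linalg.solve(lhs, rhs)
    return sol[:p], sol[p:]  # fixed effects, donor random effects


lam = 1.0
summ = summarize(X, Z, y)
beta_summary, u_summary = solve_from_summaries(summ, lam)

# Reference: generalized least squares using the full n_cells x n_cells covariance.
V = Z @ Z.T + lam * np.eye(n_cells)
Vinv_X = np.linalg.solve(V, X)
beta_direct = np.linalg.solve(X.T @ Vinv_X, Vinv_X.T @ y)

print(np.allclose(beta_summary, beta_direct))
```

The summary-based solve works with a 10-by-10 system in this toy case, while the reference calculation has to build a 1,500-by-1,500 covariance matrix that grows quadratically with the number of cells.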
“It provides a fast and scalable way to fit mixed models, making statistically sound DE analysis practical at modern single-cell scales,” says Pouyabahar. “This supports more reliable analysis of large atlases, as well as studies involving subtle perturbations, continuous gradients, and rare cell populations where accounting for cell-level variation is important.”
In addition to the simulation studies and the team’s tests on tuberculosis T-cell and healthy kidney single-cell data, FLASH-MM is already proving useful in ongoing research across a variety of biological contexts.
“FLASH-MM let us test for gene expression changes with donor age and sex in over half a million cells,” says Rachel Edgar, a postdoctoral fellow at the University Health Network who used the algorithm to aid with the mapping of the Human Liver Cell Atlas. “This project is very collaborative; we needed to regularly update our models to incorporate collaborator feedback. FLASH-MM made this process smooth and efficient, which was impressive given the scale of the atlas.”
The Bader Lab team involved with the project included Xu and Pouyabahar, bioinformatician Veronique Voisin, and data scientist Hamed Heydari. According to Xu, it took the team about one year to fully develop FLASH-MM.
“This was a great collaborative experience,” says Pouyabahar. “Changjiang brought deep statistical expertise, and our skillsets were highly complementary, which made for a strong collaboration.”
The paper, “FLASH-MM: fast and scalable single-cell differential expression analysis using linear mixed-effects models”, is available now in Nature Communications.
The Donnelly Centre for Cellular and Biomolecular Research is a research hub at the University of Toronto’s Temerty Faculty of Medicine, where scientists from diverse fields work together to advance medicine and health. Founded in 2005, the Donnelly Centre is a global leader in research on systems biology, regenerative medicine and disease modelling.
Kira Belaoussoff
Communications Coordinator at the Donnelly Centre
donnelly.communications@utoronto.ca
416-946-8253