Sep 21, 2021

Roth Lab Computational Algorithm VARITY is Best at Pinpointing Disease-Causing Gene Variants

Research

Bespectacled man sitting in his office with arms crossed

By Soha Usmani

Missense variants, gene mutations that change one amino acid into another, are a notable and medically relevant type of genomic variant. A significant barrier to determining an individual's risk for genetic disorders is ascribing disease risk to specific variants. This is because over 99% of missense variants in the human population are classified as rare, meaning they have a minor allele frequency (MAF), defined as the frequency of the second most common allele in a population, below 0.5%, and 90% are extremely rare, with a MAF less than 10⁻⁶. Less evidence is available for the pathogenicity of rare variants than for common ones, so better computational methods for inferring disease risk are needed. In a new study, a team from the lab of Fritz Roth, a professor of molecular genetics and computer science in the Donnelly Centre for Cellular and Biomolecular Research, led by PhD student Yingzhou (Joe) Wu, set out to better predict pathogenicity by creating a computational algorithm called VARITY, optimized for rare and extremely rare missense variants.
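To make the MAF definition concrete, here is a minimal Python sketch that computes the frequency of the second most common allele at one site and bins it using the 0.5% and 10⁻⁶ figures quoted above. The function names and the toy allele list are hypothetical illustrations, not part of VARITY or the study's pipeline.

```python
from collections import Counter


def minor_allele_frequency(alleles):
    """Frequency of the second most common allele at a single site.

    `alleles` is a list of observed alleles, one entry per sampled chromosome.
    Returns 0.0 when fewer than two distinct alleles are observed.
    """
    counts = Counter(alleles)
    if len(counts) < 2:
        return 0.0
    # most_common() sorts by descending count; index 1 is the second most common.
    second_most_common_count = counts.most_common()[1][1]
    return second_most_common_count / len(alleles)


def rarity_class(maf, rare_cutoff=0.005, extremely_rare_cutoff=1e-6):
    """Bin a MAF using the article's cutoffs (assumed here to be strict upper bounds)."""
    if maf < extremely_rare_cutoff:
        return "extremely rare"
    if maf < rare_cutoff:
        return "rare"
    return "common"


# Toy data: 1,000 sampled chromosomes, 3 of which carry the alternate allele.
toy_alleles = ["A"] * 997 + ["G"] * 3
toy_maf = minor_allele_frequency(toy_alleles)
toy_class = rarity_class(toy_maf)
```

With this toy sample the alternate allele appears on 3 of 1,000 chromosomes, so it falls in the "rare" bin. A MAF below 10⁻⁶ can only be observed in a cohort of more than a million chromosomes, which illustrates why direct evidence for extremely rare variants is so scarce.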

Read more about Professor Roth's work on hunting down harmful variants.

The VARITY model was built as follows. The team extracted all missense variants from roughly 18,000 genes and identified around 4,000 disease-associated proteins. They used variants and properties ("features") of variants from many databases to train their machine-learning algorithm to classify variants. Although many sources of variant annotation were used, the model was optimized for performance on rare or extremely rare variants with high-quality pathogenicity annotations from ClinVar.

After the machine-learning step, the researchers analyzed their model. They found that features such as conservation scores, differences in physicochemical properties between the missense and wild-type amino acids, and solvent-accessible surface area were the most critical contributors to predicting variant outcomes. Most importantly, the Roth lab found that the VARITY approach outperforms other computational methods in pinpointing rare pathogenic variants, identifying 12-13% more pathogenic variants than others. Indeed, when tested on de novo missense mutations for neurodevelopmental disorders, VARITY was more sensitive (had higher recall) than all the other algorithms at a stringent threshold where 90% of predictions were correct. It also surpassed other methods when tested on ClinVar rare variants that had not been used to train the model.

Future studies could improve VARITY's performance by adding features such as inheritance (e.g. dominant, recessive) and mechanism (gain or loss of function) to the underlying databases. This model, alongside further research into computational predictors, will help boost the accuracy of clinical genetic testing and give further insight into genetic disorders and their mechanisms.
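The sensitivity comparison above, recall at a threshold where 90% of predictions are correct, can be illustrated with a small Python sketch. It is not the authors' evaluation code: the function name, the scores and the labels are invented toy values, and the sketch simply reports the best recall reachable at any score cutoff whose precision is at least 90%.

```python
def recall_at_precision(scores, labels, min_precision=0.9):
    """Best recall over all score cutoffs whose precision is >= min_precision.

    scores: predicted pathogenicity scores (higher means more likely pathogenic).
    labels: 1 for a truly pathogenic variant, 0 for a benign one.
    """
    total_pathogenic = sum(labels)
    if total_pathogenic == 0:
        raise ValueError("need at least one pathogenic variant")

    # Rank variants from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    best_recall = 0.0
    true_positives = 0
    for n_called, (score, label) in enumerate(ranked, start=1):
        true_positives += label
        # Variants with tied scores are called together, so only evaluate
        # a cutoff after the last variant in a group of ties.
        if n_called < len(ranked) and ranked[n_called][0] == score:
            continue
        precision = true_positives / n_called
        if precision >= min_precision:
            best_recall = max(best_recall, true_positives / total_pathogenic)
    return best_recall


# Toy example: 10 variants, 5 of them truly pathogenic.
toy_scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
toy_recall = recall_at_precision(toy_scores, toy_labels, min_precision=0.9)
```

In this toy ranking the top four calls are all correct, so the predictor recovers four of the five pathogenic variants (recall 0.8) before precision drops below 90%. Comparing this number across predictors at the same precision is what it means for one method to be more sensitive than another at a stringent threshold.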


Nov 4, 2025

New instrumentation at the Donnelly Sequencing Centre allows for a two-week project turnaround

The Donnelly Sequencing Centre team takes pride in two things: A uniquely quick turnaround time, and their ability to take on the difficult projects other sequencing centres refuse. With the addition of two new pieces of sequencing instrumentation, the team expects to serve their clients even better.

Oct 16, 2025

Mimicking the brain’s natural firing patterns could be the next phase of neural mapping

University of Toronto scientists have added their voices to a growing group of researchers who have gleaned insights into brain function with a novel pattern-based approach to the stimulation of neurons in optogenetic research.

Oct 7, 2025

From bench to desk: How the Donnelly Centre’s newest scientist will shape his space

Dr. Khalid Al-Zahrani’s new lab at the University of Toronto will use novel CRISPR applications and functional genomics to identify the genes that drive cancer growth, and develop targeted therapies. His research lab, situated on the third floor of the Donnelly Centre, is empty: the shelves, desks, and benches wait to be occupied. He has plans for every space.
