- University of Cambridge, Banting Postdoctoral Researcher, 2021-2022.
- Independent Researcher, 2019-2021.
- University of British Columbia, PhD in Medical Genetics, 2012-2019.
- McMaster University, BSc (Hons) in Molecular Biology and Genetics, 2007-2011.
MY RESEARCH OVERVIEW
‘The Laboratory for RNA-Based Lifeforms’ is an interdisciplinary research group applying state-of-the-art computing to solve biology’s biggest problems.
Planetary-Scale Virus Surveillance Network
DNA and RNA sequencing data is growing exponentially, even outpacing Moore’s Law. Currently, public databases contain 60+ million gigabytes (60 petabytes) of sequencing data from 10+ million samples, and this volume doubles every 18 months. Samples range from cancer cells in a lab at UofT to anal swabs of penguins in Antarctica, and everything in between. Along with what researchers intended to study, these samples often capture sequences from viruses, which go unanalyzed.
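To make the doubling figure concrete, here is a minimal back-of-the-envelope sketch. It assumes the growth stays perfectly exponential, with the 60-petabyte starting point and 18-month doubling time quoted above held constant; the function name `projected_petabytes` is purely illustrative and not part of any of our tools.

```python
def projected_petabytes(years_from_now: float,
                        current_pb: float = 60.0,
                        doubling_months: float = 18.0) -> float:
    """Project total sequencing data volume in petabytes.

    Assumption: volume doubles every `doubling_months` months,
    indefinitely. Real growth will deviate from this idealized curve.
    """
    # Number of doubling periods that fit in the elapsed time.
    doublings = (years_from_now * 12.0) / doubling_months
    return current_pb * 2.0 ** doublings


if __name__ == "__main__":
    for years in (0, 1.5, 3, 6):
        print(f"{years:>4} years: {projected_petabytes(years):,.0f} PB")
```

Under these assumptions, the public archives would quadruple within three years, which is why analysis methods have to scale with the data rather than with the number of researchers.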
At most 0.1% of Earth’s viruses have been identified. To characterize the full diversity of viruses on Earth, we develop algorithms and computing techniques to analyze sequencing data at the petabyte scale. In effect, we recycle billions of dollars of data to drive biological discovery. Recently, in one 11-day analysis, we discovered 130,000+ new species of RNA viruses, nearly 10x more than were previously known (including nine surprising new species of coronaviruses). Moving forward, we are developing a system to monitor this global stream of sequencing data and identify where and when pathogens of pandemic potential show up. It is better that we find them before they find us.
Our research collective focuses on understanding the structure and function of genes through the prism of RNA. Interdisciplinary by design, we combine computational and molecular innovation in the pursuit of fundamental ideas.
Ultra-deep RNA Virus Discovery
The sequence biodiversity of Earth’s RNA virome is enormous and unexplored: at most 0.1% of RNA viruses have been described. We create the computational means for ultra-efficient virus discovery by combining modern informatics and massive (petabyte-scale) data analyses. Together, we are building the digital infrastructure to enable the global surveillance of pathogens of pandemic potential.
A new RNA Genetics
Through illuminating the depths of the “Dark Virome”, we are expanding the known diversity of RNA viruses and virus-like elements, including those thought to be modern remnants of Earth’s most primordial lifeforms. Specifically, we study RNA enzymes, or ribozymes, and structural RNA elements of unknown function. Analogous to the “DNA genetic code” for protein-coding genes, we are learning to read the structural “RNA genetic code” which first evolved in the early RNA World.
Deciphering the ribosome heterogeneity of cancer
The ribosome, itself a catalytic RNA molecule decorated with protein, is central to life as we have come to understand it. Yet the population genetic variation of ribosomal RNA, both natural and pathogenic (as in cancer), is poorly understood. We are cataloguing the genetic and epigenetic heterogeneity of ribosomal RNA and delineating its impact on normal and diseased translation.
- Petabase-scale sequence alignment catalyses viral discovery. Edgar RC, Taylor J, Lin V, Altman T, Barbera P, Meleshko D, Lohr D, Novakovsky G, Buchfink B, Al-Shayeb B, Banfield JF, de la Peña M, Korobeynikov A, Chikhi R, Babaian A. Nature. 2022 Feb;602(7895):142-147.
- Loss of m1acp3Ψ Ribosomal RNA Modification Is a Major Feature of Cancer. Babaian A, Rothe K, Girodat D, Minia I, Djondovic S, Milek M, Spencer Miko SE, Wieden HJ, Landthaler M, Morin GB, Mager DL. Cell Reports. 2020 May 5;31(5):107611.
- Extant hybrids of RNA viruses and viroid-like elements. Forgia M, Navarro B, Daghino S, Cervera A, Gisel A, Perotto S, Aghayeva DN, Akinyuwa MF, Gobbi E, Zheludev IN, Edgar RC, Chikhi R, Turina M, Babaian A, Di Serio F, de la Peña M. bioRxiv 2022. doi: 10.1101/2022.08.21.504695v1