World’s Largest Map of Protein Connections Holds Clues to Health and Disease
The human body is composed of billions of cells, each of which is made and maintained through countless interactions among its molecular parts. But which interactions sustain health and which ones can cause disease when they go awry? The Human Genome Project has provided us with a “parts list” for the cell, but only by understanding how these parts fit together, or interact, can we really begin to understand how the cell works and what goes wrong in disease.
To answer these questions, scientists needed a reference map, or interactome, of the interactions between gene-encoded proteins, which make up cells and do most of the work in them.
Almost a decade in the making, the human protein map is now available thanks to a joint effort involving over 80 researchers in the United States, Canada, Spain, Belgium, France and Israel. The effort was led by Marc Vidal, David Hill and Michael Calderwood at the Center for Cancer Systems Biology (CCSB) at Dana-Farber Cancer Institute, and Frederick Roth, a professor of molecular genetics and computer science at the University of Toronto’s Donnelly Centre.
The largest of its kind, the human reference interactome (HuRI) map charts 52,569 interactions between 8,275 human proteins, as described in a study published in Nature.
Humans have about 20,000 protein-coding genes but scientists still know remarkably little about most of the proteins they encode. Fortunately, this information can be gleaned from interaction data thanks to the “guilt by association” principle, according to which two proteins that have similar interacting partners are likely involved in similar biological processes.
“We can use our human interactome map to predict protein function,” says Roth, who’s also Senior Scientist at the Sinai Health System’s Lunenfeld-Tanenbaum Research Institute in Toronto.
“People can look up their favourite protein and get clues about its function from the proteins it interacts with.”
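The guilt-by-association idea can be illustrated with a small sketch. This is not the method used in the study. It assumes a toy list of interactions with placeholder protein names (not HuRI data) and uses the Jaccard index, shared partners divided by all partners, as one simple way to score how similar two proteins' partner sets are.

```python
# Hypothetical toy interaction list; names are placeholders, not HuRI data.
interactions = [
    ("A", "X"), ("A", "Y"), ("A", "Z"),
    ("B", "X"), ("B", "Y"), ("B", "W"),
    ("C", "Q"),
]


def partner_sets(edges):
    """Map each protein to the set of proteins it interacts with."""
    partners = {}
    for u, v in edges:
        partners.setdefault(u, set()).add(v)
        partners.setdefault(v, set()).add(u)
    return partners


def jaccard(a, b):
    """Shared partners divided by all partners of either protein."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_similar(protein, partners):
    """Rank all other proteins by partner-set similarity to `protein`."""
    scores = [
        (other, jaccard(partners[protein], partners[other]))
        for other in partners
        if other != protein
    ]
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


partners = partner_sets(interactions)
best_match, score = rank_similar("A", partners)[0]
print(best_match, score)  # B 0.5
```

In this toy network, A and B share two of their four combined partners, so B is the best candidate for having a function similar to A, while C shares none.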
The data are already revealing new cellular roles for human proteins involved in programmed cell death, release of cellular cargo and other essential processes, for example.
And, by integrating protein interaction data with tissue-specific gene expression, the teams have been able to identify protein networks behind the development and maintenance of different tissues, revealing new therapeutic targets for diverse diseases.
Furthermore, using HuRI as a reference, they were able to see how disease-causing protein variants rewire the network, revealing molecular mechanisms behind those disorders.
“Genome sequencing can identify the variants carried by an individual that make them susceptible to disease, but it doesn’t reveal how the disease is caused,” says Calderwood. “Changes in the interactions of a protein is one possible mechanism of disease, and this map provides a starting point to study the impact of disease-associated variants on protein-protein interactions.”
The Toronto and Boston teams previously did two smaller studies mapping a total of ~14,000 protein interactions. Now HuRI has interrogated proteins encoded by nearly all human protein-coding genes and expanded the map four-fold.
To create HuRI, the researchers co-expressed almost all human proteins in pairs in yeast cells. When the two proteins in a pair interact, or bind one another, they form a molecular switch that boosts yeast cell growth, a sign that an interaction has occurred.
The team tested all possible pairwise combinations among 17,500 proteins for their ability to interact with each other in three separate versions of a yeast-based assay, each done in triplicate, amounting to a staggering three billion separate tests. The results yielded ~53,000 high-confidence binary interactions between more than 8,000 proteins, which were verified by other methods. The majority of interactions had never been detected before.
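The scale of the screen can be checked with a quick back-of-the-envelope calculation. The sketch below assumes that "all possible pairwise combinations" means ordered pairs, with each protein tested against every protein in both orientations; that is an assumption, not something the article states. The variable names are illustrative.

```python
n_proteins = 17_500      # proteins screened, from the article
assay_versions = 3       # separate versions of the yeast-based assay
replicates = 3           # each version done in triplicate

# Assumption: ordered pairs, so every protein is paired with every protein
# in both orientations (self-pairs included).
ordered_pairs = n_proteins ** 2
total_tests = ordered_pairs * assay_versions * replicates

print(f"{ordered_pairs:,} pairs")   # 306,250,000 pairs
print(f"{total_tests:,} tests")     # 2,756,250,000 tests
```

Under this assumption the total comes to roughly 2.8 billion tests, consistent with the "three billion" quoted above. Counting each unordered pair only once would give about half that figure.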
Although it is the largest of its kind to date, the map remains incomplete, representing between 2 and 11 per cent of all human protein interactions. Roth said that one reason many interactions were missed is probably that yeast cells lack certain human-specific molecular factors needed for proper protein function.
Despite these limitations, HuRI has more than tripled the number of known interactions between human proteins and will serve as an important resource for the research community. Since HuRI was made available in April 2019 on bioRxiv, an open-access preprint server, 15,000 people have already visited the data web portal, which was built by Miles Mee, Mohamed Helmy and Gary Bader, who is also a professor in the Donnelly Centre.
“We already had lots of people download the whole dataset and so I imagine we’ll see the iteration of our previous paper, which has already been cited over 800 times and it is less than a third of the size of HuRI,” says Roth.
The research was primarily supported by the National Institutes of Health’s National Human Genome Research Institute, but had additional support internationally from other sources.