To promote more rapid exploration of such proteins, the authors created the unknome database, which assigns every protein a “knownness” score reflecting the information in the scientific literature about its function, conservation across species, subcellular compartmentalization, and other elements. By this measure, many thousands of proteins have a knownness score near zero. Proteins from model organisms are included, along with those from the human genome. The database is open to all and is customizable: users can assign their own weights to the different elements, generating their own set of knownness scores with which to prioritize their research.
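The idea of user-supplied weights can be illustrated with a minimal sketch. It assumes, purely for illustration, that a knownness score is a weighted sum of per-element evidence values; the article does not describe the database's actual scoring scheme, and the protein names, element names, and numbers below are all hypothetical.

```python
# Hypothetical evidence values per protein; not taken from the real database.
PROTEINS = {
    "protein_A": {"function": 0.0, "conservation": 1.0, "localization": 0.0},
    "protein_B": {"function": 3.0, "conservation": 1.0, "localization": 2.0},
}

# Equal weighting of every element (an assumed default).
DEFAULT_WEIGHTS = {"function": 1.0, "conservation": 1.0, "localization": 1.0}

# A user who discounts conservation and localization evidence.
CUSTOM_WEIGHTS = {"function": 1.0, "conservation": 0.0, "localization": 0.5}


def knownness(evidence, weights):
    """Weighted sum of evidence values; elements without a weight count as 0."""
    return sum(weights.get(element, 0.0) * value
               for element, value in evidence.items())


def rank_least_known(proteins, weights):
    """Return protein names ordered from least to most known."""
    return sorted(proteins, key=lambda name: knownness(proteins[name], weights))


if __name__ == "__main__":
    for label, weights in (("default", DEFAULT_WEIGHTS), ("custom", CUSTOM_WEIGHTS)):
        scores = {name: knownness(ev, weights) for name, ev in PROTEINS.items()}
        print(label, scores, rank_least_known(PROTEINS, weights))
```

Changing the weights changes each protein's score, and potentially the ranking, which is how a researcher could tailor the list of least-known proteins to the kinds of evidence they consider most relevant.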
To test the utility of the database, the authors chose 260 genes in humans for which there were comparable genes in flies, and which had knownness scores of 1 or less in both species, indicating that almost nothing was known about them. For many of them, a complete knockout of the gene was incompatible with life in the fly; partial knockdowns or tissue-specific knockdowns led to the discovery that a large fraction contributed to essential functions influencing fertility, development, tissue growth, protein quality control, or stress resistance.
The results suggest that, despite decades of detailed study, there are thousands of fly genes that remain to be understood at even the most basic level, and the same is clearly true for the human genome. “These uncharacterized genes have not deserved their neglect,” Munro said. “Our database provides a powerful, versatile and efficient platform to identify and select important genes of unknown function for analysis, thereby accelerating the closure of the gap in biological knowledge that the unknome represents.”
Munro added: “The role of thousands of human proteins remains unclear and yet research tends to focus on those that are already well understood. To help address this we created an Unknome database that ranks proteins based on how little is known about them, and then performed functional screens on a selection of these mystery proteins to demonstrate how ignorance can drive biological discovery.”
1. Rocha JJ, Jayaram SA, Stevens TJ, et al. (2023) Functional unknomics: Systematic screening of conserved genes of unknown function. PLoS Biol 21(8): e3002222. https://doi.org/10.1371/journal.pbio.3002222