Genome complexity: the role of tiny genes (Introduction)

by David Turell, Tuesday, November 26, 2024, 18:38 @ David Turell

A whole new field:

https://www.science.org/content/article/dark-proteome-survey-reveals-thousands-new-huma...

"...recent tallies have moved even lower, to about 20,000. But a new systematic analysis of what some call the “dark proteome” suggests scientists have missed thousands of nontraditional genes that lurk in previously overlooked stretches of the genome and make smaller than average proteins.

"The newly described genes and their products could upend aspects of human biology and accelerate medical discoveries. For example, one newfound gene makes a miniature protein that appears key to a childhood cancer.

***

"He and his colleagues expanded the standard definition of a gene, typically assumed to consist of a long protein-coding DNA sequence known as an open reading frame (ORF), which has signals telling a cell where to start and stop reading it. A cell transcribes the ORF sequence into messenger RNA, which travels to cellular factories called ribosomes that assemble amino acid sequences into proteins. A typical ORF is also preceded by a snippet of DNA that attracts the proteins needed for the gene to be read. And for most researchers, an ORF qualified as a gene if it encoded a protein with 100 or more amino acids.
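To make the definition above concrete, here is a minimal Python sketch of what "start and stop signals" plus a 100-amino-acid cutoff amount to. It is a toy under stated assumptions, not the consortium's method: it reads only the forward strand, treats ATG as the only start signal, uses the three standard stop codons, and ignores the upstream DNA snippet and any Ribo-Seq or mass spectrometry evidence. The names `find_orfs`, `label`, and `MIN_CANONICAL_AA` are made up for illustration.

```python
# Toy ORF scanner: forward strand only, ATG start, standard stop codons.
# Illustrative only; real annotation relies on experimental evidence.
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_CANONICAL_AA = 100  # length cutoff quoted in the article


def find_orfs(dna):
    """Return (start, end, n_amino_acids) for each ATG...stop ORF."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):  # three possible reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start signal in this frame
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3  # codons before the stop, ATG included
                orfs.append((start, i + 3, n_aa))
                start = None  # look for the next ORF in this frame
    return orfs


def label(n_aa):
    """Classify by length alone, using the 100-amino-acid convention."""
    return "conventional length" if n_aa >= MIN_CANONICAL_AA else "short"


# Two made-up sequences: one short ORF, one above the cutoff.
short_orf = "ATG" + "GCT" * 10 + "TAA"
long_orf = "ATG" + "GCT" * 120 + "TGA"
for seq in (short_orf, long_orf):
    for start, end, n_aa in find_orfs(seq):
        print(start, end, n_aa, label(n_aa))
```

Under the old length rule, the first sequence would never have been counted as a gene, however real its product.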

"But biologists studying everything from yeast to snakes to humans have recently unearthed a plethora of so-called noncanonical ORFs, which lack those prefatory snippets and are shorter than average. Yet they are often transcribed into RNA, and a method known as ribosomal profiling, or Ribo-Seq, has shown that many of the transcribed RNAs attach to ribosomes, where they may be translated into short amino acid chains—even proteins with less than a dozen amino acids.

***

"So they teamed up with gene annotation specialist Jonathan Mudge of GENCODE, the database of officially recognized genes, and ultimately recruited several dozen other researchers from 20 institutions across four continents to help assess how many human noncanonical ORFs exist.

***

"By 2022, the scientists had tracked down 7264 noncanonical ORFs in the human genome. With the help of the Human Proteome Organization, which seeks to catalog all human proteins, and PeptideAtlas, which compiles mass spectrometry data on proteins, they set out to show that these ORFs make proteins.

"That was a “big challenge,” Youn notes. The consortium scoured PeptideAtlas’s archive of mass spectrometry data for small proteins that matched ORF sequences and sorted through published experiments that cataloged protein fragments detected by the human immune system, a blossoming field called immunopeptidomics. All told, they confirmed that one-quarter of the 7264 noncanonical ORFs they had tallied made proteins, some 3000 in all. (An ORF can be read multiple ways to make more than one protein.)

***

"They also give scientists new biomedical targets for study. Prensner and van Heesch had already begun to follow up on an ORF and its miniprotein they identified early in their dark proteome studies. By using the gene editor CRISPR to introduce mutations in the ORF, they could examine its protein’s importance in cancer cells. Though small, the ORF’s product is essential for the survival of medulloblastoma tumors, a brain cancer that affects children...

***

"Although Martinez is pleased with how much of the dark proteome has been uncovered, Youn believes much more remains to be found. The work her team and others have done casts just “slivers of light,” she says, on an unseen population of miniproteins. Her team is refining mass spectrometry techniques to detect ever smaller molecules and hopes to use them to find miniproteins that play a role in brain development.

"Where does all this leave the tally of human genes? The dark proteome has clearly boosted the total, but no one knows the true number.

"'My gut feeling it is probably not as high as 100,000,' Martinez says, 'but 50,000 is in the realm of possibility.'"

Comment: It is obvious we are too complicated for just 20,000 genes. These researchers appear to have found the answer in noncanonical ORFs.

