Base by Base
19: Systematic Identification of Promoter and UTR Variants in Rare Disease Diagnostics
14 May 2025
In this episode of Base by Base, we delve into Martin-Geary et al.’s (2025) innovative framework for uncovering disease-causing variants within promoters and untranslated regions (UTRs) in individuals with rare disorders. Leveraging genome sequencing data from 8,040 undiagnosed trios in the Genomics England 100,000 Genomes Project, the authors combine precise region definitions based on MANE transcripts and ENCODE candidate cis-regulatory elements with stringent filtering and annotation tools (including VEP with UTRannotator, SpliceAI, CADD, PhyloP, and FABIAN) to prioritize de novo non-coding variants with a high likelihood of pathogenicity.

Key Highlights:

The study defines over 20 million bases of proximal promoter and UTR sequence across 1,567 dominant disease genes and excludes coding regions to focus on regulatory elements; it then filters de novo variants by allele frequency and region overlap, yielding 1,311 candidates prior to annotation.

Utilizing annotation thresholds calibrated for non-coding contexts, the pipeline prioritizes eleven de novo variants, nine of which match patients’ phenotypes and include both previously confirmed diagnoses (e.g., PAX6, MEF2C) and novel findings (e.g., SLC2A1, NIPBL, ZBTB18, SETD5, GNAS).

Clinical review and functional follow-up, such as RNA-seq and DNA methylation episignatures, validate the diagnostic potential of these variants.

Applying the same filters to ClinVar confirms high specificity, prioritizing 53.7% of known pathogenic variants while excluding 99.3% of benign ones, though sensitivity for promoter variants remains an area for improvement.

Finally, a burden test across 7,862 probands matched to controls shows no significant enrichment of prioritized promoter or UTR variants, underscoring challenges in detecting aggregate non-coding variant effects at current cohort sizes.
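The candidate-selection step described in the highlights (keep de novo variants, drop those that are too common, and require overlap with a defined promoter or UTR region) could be sketched roughly as follows in Python. This is an illustration only, not the authors' pipeline: the `Region` and `Variant` types, the function names, and the `MAX_ALLELE_FREQ` cutoff are all hypothetical, and these notes do not state the allele frequency threshold actually used.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Region:
    """A regulatory interval (hypothetical type for illustration)."""
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    kind: str   # e.g. "promoter", "5'UTR", "3'UTR"


@dataclass(frozen=True)
class Variant:
    """A called variant (hypothetical type for illustration)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    allele_freq: float  # population allele frequency
    is_de_novo: bool


# Hypothetical cutoff chosen only for this sketch; not taken from the paper.
MAX_ALLELE_FREQ = 1e-4


def overlapping_region(variant: Variant, regions: List[Region]) -> Optional[Region]:
    """Return the first region containing the variant position, or None."""
    for region in regions:
        if region.chrom == variant.chrom and region.start <= variant.pos <= region.end:
            return region
    return None


def candidate_variants(
    variants: List[Variant],
    regions: List[Region],
    max_af: float = MAX_ALLELE_FREQ,
) -> List[Tuple[Variant, Region]]:
    """Keep rare de novo variants that fall inside a promoter or UTR region."""
    candidates = []
    for variant in variants:
        # Discard inherited variants and variants that are too common.
        if not variant.is_de_novo or variant.allele_freq > max_af:
            continue
        region = overlapping_region(variant, regions)
        if region is not None:
            candidates.append((variant, region))
    return candidates
```

The linear scan over regions keeps the sketch short; with millions of bases of region definitions, a real implementation would more plausibly use an interval index. Annotation-based prioritization would then be applied to the surviving candidates as a separate step.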
Conclusion:

Martin-Geary et al.’s framework demonstrates that routine interrogation of promoters and UTRs can yield actionable genetic diagnoses, albeit at modest incremental yield, and provides a reproducible, highly specific pipeline that can be integrated into clinical diagnostic workflows. As our understanding of regulatory genomics deepens and annotation tools improve, such approaches will become increasingly powerful for uncovering hidden causes of rare disease.

Reference:

Martin-Geary, A. C., Blakes, A. J. M., Dawes, R., et al. (2025). Systematic identification of disease-causing promoter and untranslated region variants in 8040 undiagnosed individuals with rare disease. Genome Medicine, 17, 40. https://doi.org/10.1186/s13073-025-01464-2

License:

This episode is based on an open access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/