Genetic risk scores can reveal hidden DNA information

Researchers have found that polygenic risk scores, which summarize a person's likelihood of developing diseases like diabetes and cancer, can be reverse-engineered to uncover underlying genetic data. This vulnerability raises privacy concerns, potentially allowing identification through public databases or reconstruction by insurers. The discovery highlights risks in sharing such scores, even anonymously.

Polygenic risk scores (PRS) aggregate the effects of numerous single-nucleotide polymorphisms (SNPs) in the genome to estimate disease predispositions. Companies like 23andMe and researchers use these scores to outline health risks, and individuals sometimes share them publicly for interpretation advice.
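As a rough illustration of that aggregation, a score can be thought of as a weighted sum of risk-allele counts. The sketch below assumes the common additive form, in which each SNP contributes its weight multiplied by 0, 1 or 2 copies of the risk allele. The function name and the three weights are made up for illustration and do not come from any real model.

```python
from typing import Sequence


def polygenic_risk_score(weights: Sequence[float], dosages: Sequence[int]) -> float:
    """Return the weighted sum of risk-allele counts (0, 1 or 2 per SNP)."""
    if len(weights) != len(dosages):
        raise ValueError("weights and dosages must have the same length")
    if any(d not in (0, 1, 2) for d in dosages):
        raise ValueError("each dosage must be 0, 1 or 2")
    return sum(w * d for w, d in zip(weights, dosages))


# Hypothetical three-SNP model: one weight per SNP.
weights = [0.25, -0.5, 0.125]
# Hypothetical person: copies of the risk allele carried at each SNP.
dosages = [2, 0, 1]

score = polygenic_risk_score(weights, dosages)
print(score)
```

Only the single number `score` is typically shared, which is why it was long assumed to say little about the individual SNPs behind it.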

PRS have traditionally been viewed as low-risk for privacy, because recovering the underlying SNPs from a score is an instance of the computationally hard knapsack problem, akin to deducing a phone number from the sum of its digits. The new work shows that they can be exploited nonetheless. The key lies in the precise weights, up to 16 digits long, assigned to each SNP's contribution to disease risk, particularly in smaller models.

Gamze Gürsoy at Columbia University in New York explained: “Because the final polygenic risk score is constrained by a finite number of ways you could arrive at that number, and a statistically likely arrangement of the underlying SNPs, it can be deduced with a high degree of accuracy.” Together with Kirill Nikitin, Gürsoy tested 298 PRS models that use 50 or fewer SNPs on genetic data from 2353 individuals. By calculating the possible genomes behind each score and filtering out improbable mutations, they chained attacks across models, so that SNPs recovered from one model helped attack the next. The approach reconstructed genotypes with 94.6 percent accuracy and predicted 2450 SNPs per person.
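The idea in that quote can be sketched with a toy brute-force search. This is not the researchers' method, which the article does not detail. It assumes an additive score, a model small enough to enumerate every genotype, and a Hardy-Weinberg prior for ranking candidates. All weights, allele frequencies and function names are hypothetical.

```python
from itertools import product
from math import isclose, log


def score_of(weights, dosages):
    """Additive score: sum of weight times risk-allele count."""
    return sum(w * d for w, d in zip(weights, dosages))


def genotype_log_prob(dosages, freqs):
    """Log-probability of a genotype, assuming Hardy-Weinberg equilibrium."""
    total = 0.0
    for d, p in zip(dosages, freqs):
        probs = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)
        total += log(probs[d])
    return total


def reconstruct(score, weights, freqs, tol=1e-9):
    """List every genotype whose score matches, most probable first."""
    candidates = [
        g for g in product((0, 1, 2), repeat=len(weights))
        if isclose(score_of(weights, g), score, abs_tol=tol)
    ]
    return sorted(candidates, key=lambda g: genotype_log_prob(g, freqs), reverse=True)


# Hypothetical four-SNP model with precise weights and allele frequencies.
weights = [0.1372, 0.0845, -0.2291, 0.0518]
freqs = [0.3, 0.1, 0.45, 0.2]
true_genotype = (1, 0, 2, 1)
shared_score = score_of(weights, true_genotype)

# Precise weights: the score pins down the genotype.
precise = reconstruct(shared_score, weights, freqs)

# Coarse weights (one decimal place): many genotypes give the same score.
coarse_weights = [round(w, 1) for w in weights]
coarse = reconstruct(score_of(coarse_weights, true_genotype), coarse_weights, freqs)

print(len(precise), len(coarse))
```

In this toy case the precise weights leave a single candidate, while the rounded weights leave several. That contrast is the intuition for why high-precision weights in small models leak more. Enumeration grows as 3 to the power of the number of SNPs, so a realistic attack would need something smarter than this loop.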

Notably, just 27 SNPs sufficed to identify someone in a database of 500,000 samples, with up to 90 percent precision for relatives. Individuals of African and East Asian descent faced higher identification risks due to underrepresentation in genetic databases. Gürsoy noted that 447 small, high-precision models in a public database are vulnerable.
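To see why a handful of recovered SNPs can single out a person, consider a minimal linkage sketch. It assumes the attacker holds a table of genotypes keyed by sample ID and simply looks for records that agree with the recovered SNPs. The toy database, the sample IDs and the `max_mismatches` parameter are all hypothetical.

```python
from typing import Dict, List, Tuple

Genotype = Tuple[int, ...]


def matching_records(
    recovered: Genotype,
    database: Dict[str, Genotype],
    max_mismatches: int = 0,
) -> List[str]:
    """Return IDs of records that differ from `recovered` at no more than
    `max_mismatches` SNP positions."""
    hits = []
    for record_id, genotype in database.items():
        mismatches = sum(1 for a, b in zip(recovered, genotype) if a != b)
        if mismatches <= max_mismatches:
            hits.append(record_id)
    return hits


# Toy database of risk-allele counts at five SNPs (made-up values).
database = {
    "sample_A": (0, 1, 2, 0, 1),
    "sample_B": (1, 0, 2, 1, 0),
    "sample_C": (1, 0, 2, 1, 1),
}
recovered = (1, 0, 2, 1, 0)

exact = matching_records(recovered, database)
near = matching_records(recovered, database, max_mismatches=1)
print(exact, near)
```

With 27 SNPs there are at most 3**27, or roughly 7.6 trillion, possible combinations of allele counts, far more than 500,000 samples. Real genotypes are not spread evenly across those combinations, so this is only an upper bound on how distinguishing the SNPs can be.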

“We wanted to point out that the risk is low, but under [some conditions], there might still be some leakage,” Gürsoy said, urging caution in research designs involving vulnerable groups. Ying Wang at Massachusetts General Hospital acknowledged existing data protections and computational limits but recommended treating small models as sensitive in clinical contexts and consent processes.

The findings stem from a preprint on bioRxiv (DOI: 10.64898/2026.02.16.706191).
