
Mind the Gap: Whole Genome Sequencing and Phenotype Prediction

By Diagnostics World Staff 

November 14, 2025 | In a study published this week in Nature, researchers from Illumina and The University of Queensland demonstrated the importance of whole-genome sequencing (WGS) for more fully capturing the genetics underlying complex human traits and diseases. Across the 34 diseases and traits studied, WGS captured about 88% of the genetic signal, measured against heritability estimates from family studies. The authors posit that this marks a step toward solving the “missing heritability” problem by narrowing the gap between family-based heritability and the heritability estimates made by genome-wide association studies (GWAS). (DOI: 10.1038/s41586-025-09720-6)

Reconciling Phenotypes and Predictions  

Although GWAS have identified thousands of SNPs associated with many traits and diseases, the amount of phenotypic variance explained by GWAS-detected associations remains, for most traits, substantially lower than SNP-based heritability, that is, the proportion of phenotypic variance explained by all SNPs considered together. The gap between GWAS-detected associations and SNP-based heritability has been called “hidden heritability” and is expected to vanish with larger GWAS sample sizes.

But SNP-based heritability estimates still differ from pedigree-based estimates of narrow-sense heritability. This gap, known as “still-missing heritability”, has been attributed to three factors: genetic variation not well tagged by common SNPs (including rare variants and structural variants), and shared environmental effects between close relatives and non-additive genetic effects, both of which may have inflated estimates of additive genetic variation from pedigree-based studies.
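To make the two gaps concrete, here is a minimal Python sketch of the bookkeeping. It assumes three estimates for a single trait: the variance explained by GWAS-detected associations, the SNP-based heritability, and the pedigree-based heritability. The class, the function, and every number in it are hypothetical illustrations, not values or methods from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeritabilityEstimates:
    """Three estimates for one trait, each a proportion of phenotypic variance."""
    gwas_hits: float  # variance explained by GWAS-detected associations
    snp_based: float  # variance explained by all SNPs considered together
    pedigree: float   # narrow-sense heritability estimated from relatives


def heritability_gaps(est: HeritabilityEstimates) -> dict:
    """Return the two gaps described in the text as simple differences."""
    return {
        # Gap expected to shrink as GWAS sample sizes grow
        "gwas_to_snp_gap": est.snp_based - est.gwas_hits,
        # Gap between SNP-based and pedigree-based estimates
        "snp_to_pedigree_gap": est.pedigree - est.snp_based,
    }


if __name__ == "__main__":
    # Hypothetical numbers chosen only to illustrate the arithmetic
    example = HeritabilityEstimates(gwas_hits=0.10, snp_based=0.25, pedigree=0.40)
    for name, value in heritability_gaps(example).items():
        print(f"{name}: {value:.2f}")
```

In practice each estimate carries a standard error, so a real comparison would test whether a gap differs from zero rather than read the raw difference.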

“Overall, quantifying the contribution of these different factors is crucial for designing optimal experiments to identify causal genetic variation for complex traits and disease,” the authors write.  

The authors sought to address these limitations using WGS data from 347,630 unrelated individuals of European ancestry in the UK Biobank to quantify the contribution of coding and non-coding SNPs to the heritability of 34 complex traits and diseases. They then complemented these analyses with GWAS of all phenotypes in the original sample plus all of their relatives in the UK Biobank, identifying 886 associations involving rare variants across traits.

“Our GWAS results indicate that a substantial amount of the still-missing heritability of complex traits is already mappable using the GWAS experimental design applied to WGS data of fewer than 500,000 individuals,” they write.  

The authors show that heritability attributable to additive genetic effects at WGS variants is approximately 88% of that estimated from relatives in the UKB. They show that coding and non-coding genetic variants account for 21% and 79% of the rare-variant WGS-based heritability, respectively. For 15 quantitative traits, the team found no significant difference between WGS-based and pedigree-based estimates, suggesting heritability may no longer be missing for those traits.  
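The percentages above are simple ratios, and a short Python sketch shows how they are read. The function names and the input heritability values are hypothetical; only the 21% coding share comes from the figures reported in the article, and the code is not the authors' estimation method.

```python
def fraction_captured(h2_wgs: float, h2_pedigree: float) -> float:
    """Share of pedigree-based heritability recovered by WGS-based heritability."""
    if h2_pedigree <= 0:
        raise ValueError("pedigree-based heritability must be positive")
    return h2_wgs / h2_pedigree


def split_rare_variant_h2(h2_rare: float, coding_share: float = 0.21) -> dict:
    """Partition rare-variant heritability into coding and non-coding parts.

    The default coding_share of 0.21 is the figure reported in the article;
    the non-coding part is the remainder (0.79).
    """
    if not 0.0 <= coding_share <= 1.0:
        raise ValueError("coding_share must lie between 0 and 1")
    return {
        "coding": h2_rare * coding_share,
        "non_coding": h2_rare * (1.0 - coding_share),
    }


if __name__ == "__main__":
    # Hypothetical trait: WGS-based h2 of 0.44 against a pedigree-based h2 of 0.50
    print(f"captured: {fraction_captured(0.44, 0.50):.0%}")
    # Hypothetical rare-variant heritability of 0.10
    print(split_rare_variant_h2(0.10))
```

A ratio near 1 for a given trait corresponds to the case the authors describe for 15 quantitative traits, where WGS-based and pedigree-based estimates are statistically indistinguishable.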

Teams and Tools  

The work was done by researchers affiliated with the Illumina Artificial Intelligence Laboratory, the Institute for Molecular Bioscience at The University of Queensland, and other institutions.

"Population-level genomic datasets like UK Biobank give researchers access to a wealth of data," said Kyle Farh, vice president of Artificial Intelligence at Illumina and co-author of the study, in a press release. "Illumina's leading AI software and informatics capabilities enable greater insights from that data to drive precision health care and drug discovery."

"Quantifying the relative contribution of rare and common variants behind this heritability gap gives researchers better strategies to identify genes to target for drug development and discovery," added Loic Yengo, professor of statistical genomics at The University of Queensland's Institute for Molecular Bioscience, who co-supervised the study, in the same press release.

The study relied mainly on WGS data with variants called using DRAGEN 3.7.8. Further analyses used SNP-array genotypes and genotypes imputed from two reference panels: the Haplotype Reference Consortium (HRC) plus UK10K, and TOPMed. All analyses requiring WGS data or SNP-array data imputed with the TOPMed reference panel were performed on the DNAnexus platform, whereas analyses that did not require individual-level data or were not cloud-restricted were performed on local computing clusters.

The study also revealed a significant correlation between scores from Illumina's PrimateAI-3D and variant effect sizes, underscoring the importance of advanced analysis tools for rare variant interpretation.  

"This study shows how Illumina's whole-genome sequencing, powered by DRAGEN secondary analysis and cutting-edge statistical and deep-learning tools, get more out of large cohort studies," commented Rami Mehio, senior vice president and general manager of BioInsight at Illumina. "Our top-performing WGS reveals much more of the genetic signals underlying common diseases, offering researchers AI driven insights that can predict disease risk and identify drug targets." 

Mehio’s BioInsight group is a recently launched business within Illumina, created to meet industry demand for deeper biological insights as researchers and pharma companies seek to access and interpret ever-larger multiomic datasets.

BioInsight’s key focus areas are:  

  • Working with large national initiatives and industry partners to enable large-scale genetic and biological data generation projects.
  • Developing software solutions to analyze multimodal data at population scale.
  • Providing platforms to enable private and secure data access for research and pharmaceutical partners.
  • Developing AI tools through strategic partnerships to build an ecosystem for large-scale multimodal data analysis. 