Bridging GWAS to genes: an integrative multi-omics approach using cattle data
Publisher
BioMed Central Ltd
Rights
(c) The author/s
CC BY-NC-ND 4.0
Abstract
Background
Genome-wide association studies (GWASs) have identified thousands of loci for complex traits, but pinpointing causal variants and linking them to target genes remains challenging. Several strategies have been proposed to address these challenges, including comparisons across the genome, larger and multi-breed datasets, multi-trait analyses, and the use of multi-omics data.
Results
We used a multi-breed dataset of over 81,000 cows from Australia, including Holstein, Jersey, and Australian Red, with phenotypes for milk lactose percentage (LP) and imputed sequence genotypes. SNPs were LD-pruned at r² > 0.95, leaving ~1.1 million SNPs, and BayesR was used to estimate their effects on LP. These SNP effects were used to predict local genomic breeding values (GEBVs) for ~400 cows from New Zealand with mammary RNA-seq data. Genetic score omics regression (GSOR) was then applied to test associations between observed gene expression and local GEBVs, identifying 711 significant genes (FDR ≤ 0.1) out of 12,000 genes expressed in the mammary gland. We developed a window-based test to assess the significance of colocalization between GSOR results and GWAS summary statistics obtained from an independent study. We found 30 windows containing both GWAS signals and GSOR-significant genes (34 genes in total); this overlap was significantly greater than expected by chance (P_Fisher = 2.96 × 10⁻⁹). Of these 34 genes, 20 contributed to the significantly enriched gene ontology term ‘transmembrane transport’ and its child terms (FDR < 0.05). These terms are relevant to the physiology of lactose production in the mammary gland.
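As an illustration only, the two simplest computational steps described above (summing SNP effects into a local GEBV, and testing whether windows with GWAS signals and windows with GSOR-significant genes overlap more than expected by chance) can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the 0/1/2 allele-dosage coding, the one-sided form of the Fisher test, and all numbers in the usage lines are hypothetical, since the abstract does not report the window size or the full contingency table.

```python
from math import comb


def local_gebv(dosages, snp_effects):
    """Local GEBV of one animal for one genomic window.

    Assumption: genotypes are coded as allele dosages (0, 1 or 2) and the
    local GEBV is the sum of dosage * estimated SNP effect over the SNPs
    that fall inside the window.
    """
    if len(dosages) != len(snp_effects):
        raise ValueError("dosages and snp_effects must have the same length")
    return sum(d * b for d, b in zip(dosages, snp_effects))


def overlap_p_value(n_windows, n_gwas, n_gsor, n_both):
    """One-sided Fisher exact P-value for window overlap.

    Probability of observing at least n_both windows that carry both a
    GWAS signal and a GSOR-significant gene, if the n_gwas GWAS windows
    and the n_gsor GSOR windows were placed independently among
    n_windows windows (upper tail of the hypergeometric distribution).
    """
    max_both = min(n_gwas, n_gsor)
    total = comb(n_windows, n_gsor)
    tail = sum(
        comb(n_gwas, k) * comb(n_windows - n_gwas, n_gsor - k)
        for k in range(n_both, max_both + 1)
    )
    return tail / total


# Toy usage with made-up values (not data from the study).
toy_gebv = local_gebv([2, 1, 0], [0.5, -0.2, 0.1])
toy_p = overlap_p_value(n_windows=10, n_gwas=3, n_gsor=3, n_both=3)
```

In this toy case all three GSOR windows coincide with the three GWAS windows out of ten, so the P-value is the probability of that single most extreme arrangement. A real analysis would more likely rely on an established routine such as `scipy.stats.fisher_exact` with the window counts taken from the data.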
Conclusions
We hypothesized that these 20 genes are the most likely causal genes for the trait because their mammary expression was associated with GEBV for the trait, they were significantly colocalized with GWAS signals, and they were enriched in gene ontology terms relevant to the physiology of the trait. Our approach provides strong support for causal genes by combining multiple lines of evidence (GWAS, GSOR, and functional enrichment) and demonstrates the power of multi-omics data integration.
Citation
Ghoreishifar M, Macleod IM, Nguyen T, Lopdell TJ, Littlejohn MD, Xiang R, Chamberlain AJ, Pryce JE, Goddard ME. (2026). Bridging GWAS to genes: an integrative multi-omics approach using cattle data. BMC Genomics. 27. 1.
Creative Commons license
Except where otherwise noted, this item's license is described as (c) The author/s

