Bridging GWAS to genes: an integrative multi-omics approach using cattle data

Publisher

BioMed Central Ltd

Rights

(c) The author/s
CC BY-NC-ND 4.0

Abstract

Background: Genome-wide association studies (GWASs) have identified thousands of loci for complex traits, but pinpointing causal variants and linking them to target genes remains challenging. Several strategies have been proposed to address these challenges, including comparisons across the genome, larger and multi-breed datasets, multi-trait analyses, and the use of multi-omics data.

Results: We used a multi-breed dataset of over 81,000 cows from Australia, including Holstein, Jersey, and Australian Red, with phenotypes for milk lactose percentage (LP) and imputed sequence genotypes. LD pruning excluded SNPs with r² > 0.95, leaving ~1.1 million SNPs, and we used BayesR to estimate the effects of these SNPs on LP. The SNP effects were then used to predict local genomic breeding values (GEBVs) for ~400 mammary RNA-sequenced cows from New Zealand. Genetic score omics regression (GSOR) was applied to test associations between observed gene expression and local GEBVs, identifying 711 significant genes (FDR ≤ 0.1) out of 12,000 genes expressed in the mammary gland. We developed a window-based test to investigate the significance of colocalization between GSOR results and GWAS summary statistics obtained from an independent study. We found 30 windows containing both GWAS signals and GSOR-significant genes (34 genes in total); this overlap was significantly higher than expected by chance (P_Fisher = 2.96 × 10⁻⁹). Of these 34 genes, 20 contributed to the significantly enriched gene ontology term ‘transmembrane transport’ and its child terms (FDR < 0.05). These terms are relevant to the physiology of lactose production in the mammary gland.

Conclusions: We hypothesized that the 20 genes are the most likely causal genes for the trait because (1) mammary expression of these genes was associated with GEBV for the trait, (2) they were significantly colocalized with GWAS signals, and (3) they were enriched in gene ontology terms relevant to the physiology of the trait. Our approach provides strong evidence for causal genes from multiple lines of evidence (GWAS, GSOR, and functional enrichment) and demonstrates the power of multi-omics data integration.
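
The abstract does not give implementation details for two of its steps, so the following Python sketch is illustrative only and is not the authors' code. It assumes that a local GEBV is the sum of allele dosage times estimated SNP effect over the SNPs in a genomic window, and that the window-overlap test is a one-sided Fisher exact test on a 2 × 2 table of windows (with or without a GWAS signal, with or without a GSOR-significant gene). The function names `local_gebv` and `fisher_enrichment_p` are hypothetical, and the counts in the usage lines are toy values, not figures from the study.

```python
from math import comb


def local_gebv(dosages, effects):
    """Local GEBV for one animal in one window (assumed definition):
    sum of allele dosage (0, 1 or 2) times estimated SNP effect."""
    if len(dosages) != len(effects):
        raise ValueError("dosages and effects must have the same length")
    return sum(d * b for d, b in zip(dosages, effects))


def fisher_enrichment_p(both, gwas_only, gsor_only, neither):
    """One-sided Fisher exact P-value for over-representation.

    Arguments are window counts in the 2 x 2 table:
      both      - windows with a GWAS signal and a GSOR-significant gene
      gwas_only - windows with a GWAS signal only
      gsor_only - windows with a GSOR-significant gene only
      neither   - windows with neither
    Returns P(overlap >= both) under the hypergeometric null.
    """
    total = both + gwas_only + gsor_only + neither
    gwas_total = both + gwas_only
    gsor_total = both + gsor_only
    max_overlap = min(gwas_total, gsor_total)
    # Sum the hypergeometric probabilities of every overlap at least as large
    # as the observed one.
    tail = sum(
        comb(gwas_total, k) * comb(total - gwas_total, gsor_total - k)
        for k in range(both, max_overlap + 1)
    )
    return tail / comb(total, gsor_total)


# Toy usage with made-up numbers (not data from the paper).
toy_gebv = local_gebv([0, 1, 2], [0.5, -0.2, 0.1])
toy_p = fisher_enrichment_p(both=3, gwas_only=1, gsor_only=1, neither=3)
```

The tail sum uses only the standard library, so the sketch runs without SciPy; where SciPy is available, `scipy.stats.fisher_exact` with `alternative="greater"` computes the same one-sided test.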

Citation

Ghoreishifar M, Macleod IM, Nguyen T, Lopdell TJ, Littlejohn MD, Xiang R, Chamberlain AJ, Pryce JE, Goddard ME. (2026). Bridging GWAS to genes: an integrative multi-omics approach using cattle data. BMC Genomics. 27. 1.

Creative Commons license

Except where otherwise noted, this item's license is described as (c) The author/s