Bridging GWAS to genes: an integrative multi-omics approach using cattle data

dc.citation.issue: 1
dc.citation.volume: 27
dc.contributor.author: Ghoreishifar M
dc.contributor.author: Macleod IM
dc.contributor.author: Nguyen T
dc.contributor.author: Lopdell TJ
dc.contributor.author: Littlejohn MD
dc.contributor.author: Xiang R
dc.contributor.author: Chamberlain AJ
dc.contributor.author: Pryce JE
dc.contributor.author: Goddard ME
dc.date.accessioned: 2026-02-25T01:39:04Z
dc.date.issued: 2026-12-01
dc.description.abstract: Background: Genome-wide association studies (GWASs) have identified thousands of loci for complex traits, but pinpointing causal variants and linking them to target genes remains challenging. Several strategies have been proposed to address these challenges, including comparisons across the genome, larger and multi-breed datasets, multi-trait analyses, and multi-omics data.
Results: We used a multi-breed dataset of over 81,000 cows from Australia (Holstein, Jersey, and Australian Red) with phenotypes for milk lactose percentage (LP) and imputed sequence genotypes. LD pruning excluded SNPs with r² > 0.95, leaving ~1.1 million SNPs. We used BayesR to estimate the effects of these SNPs on LP, and used the estimated effects to predict local genomic breeding values (GEBVs) for ~400 cows from New Zealand with mammary RNA-seq data. Genetic score omics regression (GSOR) was then applied to test associations between observed gene expression and local GEBVs, identifying 711 significant genes (FDR ≤ 0.1) out of 12,000 genes expressed in the mammary gland. We developed a window-based test of the significance of colocalization between the GSOR results and GWAS summary statistics from an independent study. We found 30 windows containing both GWAS signals and GSOR-significant genes (34 genes in total); this overlap was significantly greater than expected by chance (P_Fisher = 2.96 × 10⁻⁹). Of these 34 genes, 20 contributed to the significantly enriched gene ontology term ‘transmembrane transport’ and its child terms (FDR < 0.05). These terms are relevant to the physiology of lactose production in the mammary gland.
Conclusions: We hypothesized that these 20 genes are the most likely causal genes for the trait because their mammary expression was associated with GEBV for the trait, they were significantly colocalized with GWAS signals, and they were enriched in gene ontology terms relevant to the physiology of the trait. Our approach provides strong evidence for causal genes from multiple lines of evidence (GWAS, GSOR, and functional enrichment) and demonstrates the power of multi-omics data integration.
dc.description.confidential: false
dc.edition.edition: December 2026
dc.identifier.citation: Ghoreishifar M, Macleod IM, Nguyen T, Lopdell TJ, Littlejohn MD, Xiang R, Chamberlain AJ, Pryce JE, Goddard ME. (2026). Bridging GWAS to genes: an integrative multi-omics approach using cattle data. BMC Genomics. 27. 1.
dc.identifier.doi: 10.1186/s12864-026-12525-0
dc.identifier.eissn: 1471-2164
dc.identifier.elements-type: journal-article
dc.identifier.number: 171
dc.identifier.pii: s12864-026-12525-0
dc.identifier.uri: https://mro.massey.ac.nz/handle/10179/74222
dc.language: English
dc.publisher: BioMed Central Ltd
dc.publisher.uri: https://link.springer.com/article/10.1186/s12864-026-12525-0
dc.relation.isPartOf: BMC Genomics
dc.rights: (c) The author/s [en]
dc.rights.license: CC BY-NC-ND 4.0 [en]
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/ [en]
dc.title: Bridging GWAS to genes: an integrative multi-omics approach using cattle data
dc.type: Journal article
pubs.elements-id: 609736
pubs.organisational-group: Other
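
The abstract reports that the overlap between windows carrying GWAS signals and windows carrying GSOR-significant genes was tested against chance, with a P-value labelled PFisher. The record does not describe the test itself, so the following is only a minimal sketch of one way such an enrichment test could be set up, assuming a one-sided Fisher's exact test on a 2x2 table of genomic windows. The function name and the counts in the usage lines are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_greater(both, gwas_only, gsor_only, neither):
    """One-sided Fisher's exact test for enrichment on a 2x2 table of windows.

    both      : windows with a GWAS signal AND a GSOR-significant gene
    gwas_only : windows with a GWAS signal only
    gsor_only : windows with a GSOR-significant gene only
    neither   : windows with neither

    Returns P(overlap >= both) under the hypergeometric null, i.e. with
    both margins of the table held fixed.
    """
    n = both + gwas_only + gsor_only + neither
    gwas_total = both + gwas_only
    gsor_total = both + gsor_only
    max_both = min(gwas_total, gsor_total)
    # Number of ways to choose the GSOR windows from all windows.
    denom = comb(n, gsor_total)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(gwas_total, k) * comb(n - gwas_total, gsor_total - k)
        for k in range(both, max_both + 1)
    )
    return tail / denom


# Toy counts, purely illustrative (NOT the study's data).
p = fisher_exact_greater(both=3, gwas_only=2, gsor_only=4, neither=41)
print(f"one-sided P = {p:.3g}")
```

Exact integer arithmetic via `math.comb` keeps the sketch dependency-free; for genome-scale window counts a library routine such as SciPy's `fisher_exact` would be the more usual choice.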

Files

Original bundle

Name: 609736 PDF.pdf
Size: 2.63 MB
Format: Adobe Portable Document Format
Description: Published version.pdf
