Investigation of genotype and phenotype interactions using computational statistics : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Statistics at Massey University, New Zealand

Date
2021
Publisher
Massey University
Rights
The Author
Abstract
Deciphering the precise mechanisms by which variations at the DNA level affect the measurable characteristics of organisms, termed phenotypes, through the action of complex molecular networks is a critical topic in modern biology. Such knowledge has implications spanning numerous fields, from plant and animal breeding to medicine. To this end, statistical methods must be leveraged to extract information from molecular measurements at different cellular scales, allowing us to reconstruct the regulatory networks that mediate the impact of genotype variations on a phenotype of interest. In this thesis, I investigate the use of causal inference methods to infer relationships amongst a set of biological entities from observational data. More specifically, I tackle the reconstruction of multi-omics molecular networks linking genotype to phenotype. In the first part, I developed a simulator that generates benchmark gene expression data, i.e. RNA and protein levels, from synthetic gene regulatory networks. The originality of my work is that it includes transcriptional and post-transcriptional regulation amongst genes. I used this simulation tool to evaluate and compare the performance of state-of-the-art causal inference methods in reconstructing causal relationships between genes. The evaluation focused on the ability of the methods to reconstruct relationships mediated by post-transcriptional regulation from observational transcriptomics data. I also evaluated the methods' ability to detect different types of causal relationships between genes via a catalogue of causal queries, and highlighted the shortcomings of using transcriptomics data alone to reconstruct gene regulatory networks. In the second part, I developed an analysis framework to shed light on the biological mechanisms underlying tetraploid potato tuber bruising.
I first integrated a genome-wide association study (GWAS) with a differential expression analysis of transcriptomics data to uncover genomic regions in which variations affect the response of tubers to mechanical bruising. I then used a multi-omics integration tool to jointly analyse genomics, transcriptomics, metabolomics and phenotypic data and to identify, across the omics datasets, molecular features involved in tuber bruising, including some not identified with traditional differential expression analyses. Finally, I used causal inference tools to reconstruct a multi-omics causal network linking these features, in order to decipher the molecular relationships involved in tuber bruising. I used causal queries to extract information from the reconstructed causal networks and interpret the uncovered relationships.
Keywords
Phenotype, Genotype-environment interaction, Statistical methods, Potatoes, Genetics, Potatoes, Wounds and injuries