Development of a reproducible transcriptomics variant calling workflow and its application to colorectal cancer : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Genetics at Massey University, Albany, New Zealand

Date
2020
Publisher
Massey University
Rights
The Author
Abstract
Colorectal cancer (CRC) is one of the most common cancers worldwide. New Zealand has some of the highest rates, a burden exacerbated by shortcomings in available diagnostic tools and by survival discrepancies between Māori and non-Māori populations. In this project, a bioinformatics workflow was developed to make “high confidence” single nucleotide polymorphism (SNP) variant calls from transcriptomics (RNA-seq) data. While calling variants from whole genome and exome sequencing is common, standard workflows for calling variants from RNA-seq data do not exist. Here, we aimed to use two common RNA-seq pre-processing methods, which we then complemented with an ensemble of variant calling tools to improve confidence in any variants called. We then applied this pipeline to two independent CRC datasets in the hope that the resulting variant calls could improve our understanding of the disease, one of the most significant contributors to cancer-related mortality. The variant calls included some with clinical implications, such as the same KRAS gene variant being called in both geographically distinct populations. Multiple “novel” variants, that is, those lacking clinically significant annotations, were also obtained for known oncogenic targets (e.g. MAPK1 and AKT1). RNA-seq variant calling remains problematic. The results of this study have provided some direction and considerations for future work, such as including normal samples to better distinguish between germline and somatic variants, which would also permit the use of more somatic variant calling tools. Future work is also needed to understand whether and how those novel variant calls could improve our understanding of CRC.
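The ensemble idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how calls from the different tools were combined, so the rule used here (keep a variant only if at least `min_tools` callers report it) is an assumption made for illustration, not the thesis method. The function name `consensus_calls`, the caller names, and the coordinates are all hypothetical placeholders.

```python
from collections import Counter

# A variant is keyed by (chromosome, position, reference allele, alternate allele).
Variant = tuple[str, int, str, str]


def consensus_calls(
    calls_by_tool: dict[str, set[Variant]], min_tools: int = 2
) -> set[Variant]:
    """Return variants reported by at least `min_tools` callers.

    Assumed consensus rule for illustration only; the thesis may combine
    callers differently.
    """
    # Each tool contributes a set, so a variant is counted once per tool.
    counts = Counter(v for calls in calls_by_tool.values() for v in calls)
    return {v for v, n in counts.items() if n >= min_tools}


# Placeholder calls from three hypothetical callers (not real data).
calls = {
    "caller_a": {("chr1", 100, "C", "T"), ("chr2", 200, "G", "A")},
    "caller_b": {("chr1", 100, "C", "T"), ("chr3", 300, "A", "G")},
    "caller_c": {("chr1", 100, "C", "T"), ("chr2", 200, "G", "A")},
}

high_confidence = consensus_calls(calls, min_tools=2)
```

Raising `min_tools` trades sensitivity for confidence: with `min_tools=3` only the variant seen by every caller in this toy input is kept.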
Description
Appendix 1 is not publicly available.