Massey Documents by Type

Permanent URI for this community: https://mro.massey.ac.nz/handle/10179/294

Search Results

Now showing 1 - 4 of 4
  • Item
    Identifying important microbial and genomic biomarkers for differentiating right- versus left-sided colorectal cancer using random forest models
    (BioMed Central Ltd, 2023-07-11) Kolisnik T; Sulit AK; Schmeier S; Frizelle F; Purcell R; Smith A; Silander O
    BACKGROUND: Colorectal cancer (CRC) is a heterogeneous disease, with subtypes that have different clinical behaviours and subsequent prognoses. There is a growing body of evidence suggesting that right-sided colorectal cancer (RCC) and left-sided colorectal cancer (LCC) also differ in treatment success and patient outcomes. Biomarkers that differentiate between RCC and LCC are not well-established. Here, we apply random forest (RF) machine learning methods to identify genomic or microbial biomarkers that differentiate RCC and LCC. METHODS: RNA-seq expression data for 58,677 coding and non-coding human genes and count data for 28,557 human unmapped reads were obtained from 308 patient CRC tumour samples. We created three RF models for datasets of human genes-only, microbes-only, and genes-and-microbes combined. We used a permutation test to identify features of significant importance. Finally, we used differential expression (DE) and paired Wilcoxon-rank sum tests to associate features with a particular side. RESULTS: RF model accuracy scores were 90%, 70%, and 87% with area under curve (AUC) of 0.9, 0.76, and 0.89 for the human genomic, microbial, and combined feature sets, respectively. 15 features were identified as significant in the model of genes-only, 54 microbes in the model of microbes-only, and 28 genes and 18 microbes in the model with genes-and-microbes combined. PRAC1 expression was the most important feature for differentiating RCC and LCC in the genes-only model, with HOXB13, SPAG16, HOXC4, and RNLS also playing a role. Ruminococcus gnavus and Clostridium acetireducens were the most important in the microbial-only model. MYOM3, HOXC4, Coprococcus eutactus, PRAC1, lncRNA AC012531.25, Ruminococcus gnavus, RNLS, HOXC6, SPAG16 and Fusobacterium nucleatum were most important in the combined model. CONCLUSIONS: Many of the identified genes and microbes among all models have previously established associations with CRC. However, the ability of RF models to account for inter-feature relationships within the underlying decision trees may yield a more sensitive and biologically interconnected set of genomic and microbial biomarkers.
  • Item
    Tracking the international spread of SARS-CoV-2 lineages B.1.1.7 and B.1.351/501Y-V2 with grinch
    (F1000 Research Limited, 2021-09-17) O'Toole Á; Hill V; Pybus OG; Watts A; Bogoch II; Khan K; Messina JP; COVID-19 Genomics UK (COG-UK) consortium; Network for Genomic Surveillance in South Africa (NGS-SA); Brazil-UK CADDE Genomic Network; Tegally H; Lessells RR; Giandhari J; Pillay S; Tumedi KA; Nyepetsi G; Kebabonye M; Matsheka M; Mine M; Tokajian S; Hassan H; Salloum T; Merhi G; Koweyes J; Geoghegan JL; de Ligt J; Ren X; Storey M; Freed NE; Pattabiraman C; Prasad P; Desai AS; Vasanthapuram R; Schulz TF; Steinbrück L; Stadler T; Swiss Viollier Sequencing Consortium; Parisi A; Bianco A; García de Viedma D; Buenestado-Serrano S; Borges V; Isidro J; Duarte S; Gomes JP; Zuckerman NS; Mandelboim M; Mor O; Seemann T; Arnott A; Draper J; Gall M; Rawlinson W; Deveson I; Schlebusch S; McMahon J; Leong L; Lim CK; Chironna M; Loconsole D; Bal A; Josset L; Holmes E; St George K; Lasek-Nesselquist E; Sikkema RS; Oude Munnink B; Koopmans M; Brytting M; Sudha Rani V; Pavani S; Smura T; Heim A; Kurkela S; Umair M; Salman M; Bartolini B; Rueca M; Drosten C; Wolff T; Silander O; Eggink D; Reusken C; Vennema H; Park A; Carrington C; Sahadeo N; Carr M; Gonzalez G; SEARCH Alliance San Diego; National Virus Reference Laboratory; SeqCOVID-Spain; Danish Covid-19 Genome Consortium (DCGC); Communicable Diseases Genomic Network (CDGN); Dutch National SARS-CoV-2 surveillance program; Division of Emerging Infectious Diseases (KDCA); de Oliveira T; Faria N; Rambaut A; Kraemer MUG
    Late in 2020, two genetically distinct clusters of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) with mutations of biological concern were reported, one in the United Kingdom and one in South Africa. Using a combination of data from routine surveillance, genomic sequencing and international travel, we track the international dispersal of lineages B.1.1.7 and B.1.351 (variant 501Y-V2). We account for potential biases in genomic surveillance efforts by including passenger volumes from the location where each lineage was first reported, London and South Africa respectively. Using the software tool grinch (global report investigating novel coronavirus haplotypes), we track the international spread of lineages of concern with automated daily reports. Further, we have built a custom tracking website (cov-lineages.org/global_report.html) which hosts this daily report and will continue to include novel SARS-CoV-2 lineages of concern as they are detected.
  • Item
    Phenotypic and genotypic characterisation of Lactobacillus and yeast isolates from a traditional New Zealand Māori potato starter culture
    (Elsevier BV, 2022-08-26) Sun J; Silander O; Rutherfurd-Markwick K; Wen D; Davy TP-P; Mutukumira AN
    Parāroa Rēwena is a traditional Māori sourdough produced by fermentation using a potato starter culture. The microbial composition of the starter culture is not well characterised, despite the long history of this product. Morphological, physiological, biochemical and genetic tests were conducted to characterise 26 lactic acid bacteria (LAB) and 15 yeast isolates from a Parāroa Rēwena potato starter culture. The results of sugar fermentation tests, API 50 CHL tests, and API ID 32 C tests suggest the presence of four different LAB phenotypes and five different yeast phenotypes. 16S rRNA and 26S rRNA sequencing identified the LAB as Lacticaseibacillus paracasei and the yeast isolates as Saccharomyces cerevisiae, respectively. Multilocus sequence typing (MLST) of the L. paracasei isolates indicated that they had identical genotypes at the MLST loci to L. paracasei subsp. paracasei IBB 3423 or L. paracasei subsp. paracasei F19. This study provides new insights into the microbial composition of the traditional sourdough Parāroa Rēwena starter culture.
  • Item
    Combining Tn-seq with comparative genomics identifies proteins uniquely essential in Shigella flexneri
    (3/09/2015) Freed NE; Bumann D; Silander O
    Protein functions that are essential for the growth of bacterial pathogens provide promising targets for antibacterial treatment. This is especially true if those functions are uniquely essential for the pathogen, as this might allow the development of targeted antibiotics, i.e. those that disrupt essential functions only for the pathogenic bacteria. Here we present the results of a Tn-seq experiment designed to detect essential protein-coding genes in Shigella flexneri 2a 2457T on a genome-wide scale. Our results suggest that 471 protein-coding genes in this organism are critical for cellular growth in rich media. Comparing this set of essential genes (the essential gene complement) with their orthologues in the closely related organism Escherichia coli K12 BW25113 revealed a significant number of genes that are essential in Shigella but not in E. coli, suggesting that the functional correspondence of these proteins had changed. Notably, we also identified a set of functionally related genes that are essential in Shigella but which have no orthologues in E. coli. We found an extreme bias in proteins that have evolved to provide essential functions, with many proteins essential in Shigella but not E. coli, but with none (or very few) being essential in E. coli but not Shigella. We also identify a set of genes involved in nucleotide biosynthesis that are essential in Shigella, but which lack orthologues in E. coli. Consequently, the data presented here suggest that the essential gene complement can quickly become organism specific, especially for pathogenic organisms whose genomes might have reduced robustness in their metabolic capacity (e.g. functional redundancy), or a reduced number of protein-coding genes. These results thus open the possibility of developing antibiotic treatments that target differentially essential genes, which may exist even between very closely related strains of bacteria.
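
The comparison described in the last item (essential genes in one organism checked against their orthologues in a close relative) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the dictionary-based orthologue map, and all gene identifiers below are hypothetical placeholders chosen for illustration, and real analyses would also need to handle one-to-many orthologues and uncertain essentiality calls.

```python
from typing import Dict, Set


def compare_essential_genes(
    essential_a: Set[str],
    essential_b: Set[str],
    orthologues: Dict[str, str],
) -> Dict[str, Set[str]]:
    """Classify organism A's essential genes relative to organism B.

    essential_a / essential_b: essential gene identifiers per organism.
    orthologues: maps a gene in A to its (single) orthologue in B;
    genes of A absent from the map are treated as having no orthologue.
    """
    shared: Set[str] = set()
    a_only: Set[str] = set()
    no_orthologue: Set[str] = set()
    for gene in essential_a:
        partner = orthologues.get(gene)
        if partner is None:
            # Essential in A, but B has no corresponding gene at all.
            no_orthologue.add(gene)
        elif partner in essential_b:
            # Essential in both organisms.
            shared.add(gene)
        else:
            # Orthologue exists in B but is not essential there.
            a_only.add(gene)
    return {
        "essential_in_both": shared,
        "essential_in_a_only": a_only,
        "essential_without_orthologue": no_orthologue,
    }


# Hypothetical toy data; identifiers are placeholders, not real loci.
result = compare_essential_genes(
    essential_a={"a1", "a2", "a3"},
    essential_b={"b1"},
    orthologues={"a1": "b1", "a2": "b2"},
)
for category, genes in sorted(result.items()):
    print(category, sorted(genes))
```

Running the same function with the two organisms swapped (and the orthologue map inverted) would give the reverse comparison, which is how an asymmetry such as the one reported in the abstract could be checked.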