Massey Documents by Type

Permanent URI for this community: https://mro.massey.ac.nz/handle/10179/294

Search Results

  • Item
    Mitigating cognitive biases in developing AI-assisted recruitment systems: A knowledge-sharing approach
    (IGI Global, 2022) Soleimani M; Intezari A; Pauleen DJ
    Artificial intelligence (AI) is increasingly embedded in business processes, including the human resource (HR) recruitment process. While AI can expedite recruitment, industry evidence shows that AI-recruitment systems (AIRS) may fail to reach unbiased decisions about applicants. Biases can be encoded in the datasets and algorithms of AI, leading AIRS to replicate and amplify human biases. To develop less biased AIRS, collaboration between HR managers and AI developers in training algorithms and exploring algorithmic biases is vital. This exploratory study interviewed 35 HR managers and AI developers worldwide to understand the role of knowledge sharing during their collaboration in mitigating biases in AIRS. The findings show that knowledge sharing can help to mitigate biases in AIRS by informing data labeling, understanding job functions, and improving the machine learning model. Theoretical contributions and practical implications are suggested.
  • Item
    Evolutionary analyses of large data sets : trees and beyond : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Mathematics at Massey University
    (Massey University, 2001) Holland, Barbara Ruth
    The increasing amount of molecular data available for phylogenetic studies means that larger, often intra-species, data sets are being analysed. Treating such data sets with methods designed for small interspecies data may not be useful. This thesis comprises four projects within the field of phylogenetics that focus on cases where the application of current tree estimation methods is not sufficient to answer the biological questions of interest. A simulation study contrasts the accuracy of several tree estimation methods for a particular class of five-taxon, equal-rate trees. This study highlights several difficulties with tree estimation, including the fact that some tree topologies produce "misleading" patterns that are incorrectly interpreted; that correction for multiple changes does not always increase accuracy, because of increased variance; and the difficulty of correctly placing outgroup taxa. A mitochondrial DNA data set, containing over 400 modern and ancient Adélie penguin samples, is used to estimate the rate of evolution. Straightforward tree estimation is unhelpful because the amount of homoplasy in the data makes the construction of a single reliable tree impossible. Instead, the data are represented by a network. A method that extends statistical geometry assesses whether or not a data set can be well represented by a tree. The "tree-likeness" of each quartet in the data is evaluated and displayed visually, either for the entire data set or by taxon. This aids in identifying reticulate (or simply noisy) data sets, and also particular taxa that confound tree-like signal. Novel methods are developed that use pairwise dissimilarities between isolates in intra-species microbial data sets to identify strains that are good representatives of their species or subspecies.
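To make the idea of quartet "tree-likeness" in the abstract above concrete, here is a minimal sketch based on the standard four-point condition for distance data. It is an illustration only and is not taken from the thesis: the function names `quartet_delta` and `mean_delta_by_taxon`, the score definition, and the nested-list distance matrix are all assumptions made for this example.

```python
from itertools import combinations


def quartet_delta(d, a, b, c, e):
    """Score one quartet of taxa from a symmetric distance matrix d.

    Returns a value in [0, 1]; 0 means the quartet satisfies the
    four-point condition exactly (perfectly tree-like).
    Hypothetical helper, not the thesis's implementation.
    """
    # The three ways of pairing up four taxa.
    sums = sorted([
        d[a][b] + d[c][e],
        d[a][c] + d[b][e],
        d[a][e] + d[b][c],
    ])
    smallest, middle, largest = sums
    if largest == smallest:
        # All three sums are equal: a star-like quartet, treated as tree-like.
        return 0.0
    # On a tree the two largest sums are equal, so the numerator vanishes.
    return (largest - middle) / (largest - smallest)


def mean_delta_by_taxon(d):
    """Average the quartet scores over every quartet containing each taxon."""
    n = len(d)
    totals = [0.0] * n
    counts = [0] * n
    for quartet in combinations(range(n), 4):
        score = quartet_delta(d, *quartet)
        for taxon in quartet:
            totals[taxon] += score
            counts[taxon] += 1
    return [totals[t] / counts[t] if counts[t] else 0.0 for t in range(n)]
```

Under these assumptions, taxa with a high average score are the ones that take part in many poorly tree-like quartets, which is one way a per-taxon display could flag taxa that confound the tree-like signal.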
  • Item
    Some applications of statistical phylogenetics : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Biomathematics at Massey University
    (Massey University, 2009) Schliep, Klaus Peter
    The increasing availability of molecular data means that phylogenetic studies nowadays often use datasets which combine a large number of loci for many different species. This leads to a trade-off. On the one hand, more complex models are preferred to account for heterogeneity in evolutionary processes. On the other hand, simple models are favoured when they can answer the biological questions of interest, are easy to interpret, and can be computed in reasonable time. This thesis focuses on four cases of phylogenetic analysis which arise from this conflict.
    - It is shown that edge weight estimates can be non-identifiable if the data are simulated under a mixture model. Even if the underlying process is known, the estimation and interpretation may be difficult due to the high variance of the parameters of interest.
    - Partition models are commonly used to account for heterogeneity in data sets. Novel methods are presented here which allow grouping of genes under similar evolutionary constraints. A data set containing 14 chloroplast genes from 19 anciently diverged species is used to find groups of co-evolving genes. The prospects and limitations of such methods are discussed.
    - Penalised likelihood estimation is a useful tool for improving the performance of models and allowing for variable selection. A novel approach is presented that uses pairwise dissimilarities to visualise the data as a network. It is further shown how penalised likelihood can be used to decrease the variance of parameter estimates for mixture and partition models, allowing a more reliable analysis. Estimates for the variance and the expected number of parameters of penalised likelihood estimates are derived.
    - Tree shape statistics are used to describe speciation events in macroevolution. A new tree shape statistic is introduced and the biases of different cluster methods on tree shape statistics are discussed.
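The abstract above does not say which tree shape statistics are studied, and its new statistic is not described. As a point of reference only, the sketch below computes the Colless index, a classic imbalance statistic for rooted binary trees. The nested-tuple tree encoding and the function name `colless_index` are assumptions made for this example.

```python
def colless_index(tree):
    """Colless imbalance of a rooted binary tree.

    A tree is either a leaf (any non-tuple value, such as a string) or a
    2-tuple (left_subtree, right_subtree). The index sums, over all
    internal nodes, the absolute difference between the number of leaves
    below the left child and below the right child.
    Illustrative encoding only; not taken from the thesis.
    """
    def walk(node):
        # Returns (number of leaves, imbalance accumulated in this subtree).
        if not isinstance(node, tuple):
            return 1, 0
        left, right = node
        n_left, imb_left = walk(left)
        n_right, imb_right = walk(right)
        return n_left + n_right, imb_left + imb_right + abs(n_left - n_right)

    return walk(tree)[1]


balanced = (("a", "b"), ("c", "d"))
caterpillar = ((("a", "b"), "c"), "d")
print(colless_index(balanced), colless_index(caterpillar))
```

A fully balanced tree scores 0, and the score grows as the tree becomes more comb-like, so comparing scores for trees built by different clustering methods is one way to expose a method's bias in tree shape.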