Massey Documents by Type

Permanent URI for this community: https://mro.massey.ac.nz/handle/10179/294

Search Results

Now showing 1 - 3 of 3
  • Item
    Automatically identifying errors in primary level math word problems generated by large language models : a research report submitted to School of Mathematical and Computational Sciences in partial fulfillment of the requirements for the degree of Master of Information Sciences, School of Mathematical and Computational Sciences, Massey University
    (Massey University, 2025) Mai, Zhuonan
    Ensuring the quality of mathematical word problems (MWPs) is essential for primary education. However, large language models (LLMs) struggle with error identification despite excelling in problem-solving. This research evaluates four LLMs – Mixtral-8x7B-Instruct-v0.1 (Mixtral-8x7B), Meta-Llama-3.1-8B-Instruct (Llama-3.1-8B), DeepSeek-Math-7B-Instruct (DeepSeek-Math-7B), and Llama-3.2-3B-Instruct (Llama-3.2-3B) – for detecting errors in a dataset of 5,098 LLM-generated MWPs spanning U.S. grades 1–6. A comprehensive framework with 12 error categories is introduced, going beyond most categorization schemes used in prior research. Evaluating Zero-Shot (inference without any examples), One-Shot (inference with one example), and Three-Shot (inference with three examples) approaches, as well as fine-tuning, across the four models in seven experiments, we found that the small-scale model Llama-3.2-3B achieved the highest Zero-Shot accuracy of 90% while requiring only 6 GB of GPU memory, comparable to the larger Mixtral-8x7B's fine-tuned accuracy of 90.62% (a sketch of this prompting setup appears after this list). However, owing to data noise and prompt complexity, fine-tuning degraded performance, with an average accuracy of 78.48%. Prompt complexity reduced accuracy by up to 20% for Mixtral-8x7B. Safety biases, particularly in Llama-3.1-8B and Mixtral-8x7B, led to misclassifications when prompts contained safety-trigger words. Our findings highlight the efficacy of small-scale LLMs and concise prompts for educational applications while identifying challenges in fine-tuning and model bias. We propose future research directions including noise-robust data preprocessing, refined prompt engineering, and adversarial fine-tuning. These approaches aim to enhance the reliability of LLMs in detecting errors in MWPs, thereby ensuring the validity of educational assessments and ultimately contributing to high-quality foundational mathematics education.
  • Item
    Cross-lingual learning in low-resource languages : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science, School of Natural and Computational Sciences, Massey University, Auckland, New Zealand
    (Massey University, 2022) Zhao, Jiawei
    Current machine translation techniques were developed predominantly on high-resource language pairs, yet a far broader range of languages is used in practice around the world. For instance, machine translation between Finnish, Chinese, and Russian is still not suitable for high-quality communication. This dissertation focuses on building cross-lingual models to address this issue. I aim to analyse the relationships between embeddings of different languages, especially low-resource languages, and I investigate four phenomena that can improve the translation of low-resource languages. The first study concentrates on the non-linearity of cross-lingual word embeddings. Current approaches primarily focus on linear mappings between the word embeddings of different languages. However, those approaches do not work as well for some language pairs, particularly when the two languages belong to different language families, e.g. English and Chinese. I hypothesise that the linearity often assumed in the geometric relationship between monolingual word embeddings of different languages may not hold for all language pairs. Focusing on language pairs drawn from different families, I show across multiple datasets that a non-linear map better describes their relationship (the linear baseline being questioned is sketched after this list). The second study focuses on unsupervised cross-lingual word embeddings for low-resource languages. The conventional approach to constructing cross-lingual word embeddings requires a large dictionary, which is hard to obtain for low-resource languages. I propose an unsupervised approach that, by incorporating kernel canonical correlation analysis, learns higher-quality cross-lingual word embeddings without supervision. The third study investigates a dictionary augmentation technique for low-resource languages. A key challenge in constructing an accurately augmented dictionary is high variance; I propose a semi-supervised method that bootstraps a small dictionary into a larger, high-quality one. The fourth study concentrates on data insufficiency in speech translation, where the lack of training data for low-resource languages limits the performance of end-to-end systems. I investigate the use of knowledge distillation to transfer knowledge from the machine translation task to the speech translation task and propose a new training methodology (a distillation sketch also follows this list). The results and analyses presented in this work show that a wide range of techniques can address the issues that arise with low-resource languages in machine translation. This dissertation provides deeper insight into word representations and structures in low-resource translation and should aid future researchers in making better use of their translation models.
  • Item
    Deep learning for entity analysis : a thesis submitted in partial fulfilment for the degree of Doctor of Philosophy in Computer Science at the School of Natural and Computational Sciences, Massey University, Albany, New Zealand
    (Massey University, 2021) Hou, Feng
    Our research focuses on three sub-tasks of entity analysis: fine-grained entity typing (FGET), entity linking, and entity coreference resolution. We aim to improve FGET and entity linking by exploiting document-level type constraints, and to improve entity linking and coreference resolution by embedding fine-grained entity type information. To extract more effective feature representations and offset label noise in FGET datasets, we propose three transfer learning schemes: (i) transferring sub-word embeddings to generate better out-of-vocabulary (OOV) embeddings for mentions; (ii) using a pre-trained language model to generate more effective context features; and (iii) using a pre-trained topic model to transfer topic-type relatedness through topic anchors and to select among confusable fine-grained types at inference time. The pre-trained topic model can offset label noise without retreating to coarse-grained types. To reduce the distinctiveness of existing entity embeddings and facilitate the learning of contextual commonality for entity linking, we propose a simple yet effective method, FGS2EE, that injects fine-grained semantic information into entity embeddings. FGS2EE first uses the embeddings of semantic type words to generate semantic entity embeddings, and then combines them with existing entity embeddings through linear aggregation (a sketch of this aggregation appears after this list). Based on our entity embeddings, we achieved new state-of-the-art performance on two of the five out-domain test sets for entity linking. Further, we propose DOC-AET, a method that exploits DOCument-level coherence between named entity mentions and anonymous entity type (AET) words/mentions. We learn embeddings of AET words from their inter-paragraph co-occurrence matrix, then build AET entity embeddings and document AET context embeddings from those word embeddings. AET coherence scores are computed from the AET entity embeddings and document context embeddings. By incorporating these coherence scores, DOC-AET achieved new state-of-the-art results on three of the five out-domain test sets for entity linking. We also propose LASE (Less Anisotropic Span Embeddings) schemes for coreference resolution and investigate their effectiveness with extensive experiments; our ablation studies also provide valuable insights into the contextualized representations. In summary, this thesis proposes four deep learning approaches for entity analysis, and extensive experiments show that we have achieved state-of-the-art performance on the three sub-tasks of entity analysis.
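
The first abstract above contrasts Zero-, One-, and Three-Shot prompting for error detection. The sketch below shows what that evaluation loop could look like; the error labels, prompt wording, and demonstration problems are illustrative assumptions rather than the study's actual 12-category framework, and only the model name comes from the abstract.

```python
# A minimal sketch of a Zero-/Few-Shot error-classification loop, assuming a
# hypothetical label set and prompt format (not the study's real framework).
from transformers import pipeline

# The abstract reports Llama-3.2-3B-Instruct as the strongest zero-shot model.
generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-3B-Instruct",
    device_map="auto",
)

# Hypothetical subset of an error taxonomy, for illustration only.
CATEGORIES = ["no_error", "arithmetic_error", "missing_information",
              "unrealistic_quantities"]

def build_prompt(problem: str, examples: list[tuple[str, str]]) -> str:
    """Zero-Shot when `examples` is empty; One-/Three-Shot otherwise."""
    lines = [
        "Classify the error in the following math word problem.",
        f"Answer with exactly one label from: {', '.join(CATEGORIES)}.",
    ]
    for demo_problem, demo_label in examples:  # few-shot demonstrations
        lines += [f"Problem: {demo_problem}", f"Label: {demo_label}"]
    lines += [f"Problem: {problem}", "Label:"]
    return "\n".join(lines)

# One-Shot call with a single made-up demonstration.
demos = [("Tom has 3 apples and eats 5. How many are left?",
          "unrealistic_quantities")]
prompt = build_prompt("Sara buys 4 pens for $2 each. What is the total?", demos)
out = generator(prompt, max_new_tokens=8, do_sample=False)
print(out[0]["generated_text"][len(prompt):].strip())
```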
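The cross-lingual thesis questions the linearity assumption behind standard embedding alignment. As a reference point, here is a minimal sketch of that linear baseline, orthogonal Procrustes alignment over a seed dictionary, run on synthetic data; the dimensions, noise level, and toy vectors are assumptions for illustration.

```python
# Orthogonal Procrustes: the standard linear cross-lingual mapping baseline.
import numpy as np

rng = np.random.default_rng(0)
d, n_pairs = 50, 200

# Toy source-language vectors, and a hidden rotation plus noise standing in
# for the target-language embedding space.
X = rng.normal(size=(n_pairs, d))                      # source vectors
Q_true, _ = np.linalg.qr(rng.normal(size=(d, d)))      # hidden orthogonal map
Y = X @ Q_true + 0.01 * rng.normal(size=(n_pairs, d))  # target vectors

# W* = argmin ||XW - Y||_F subject to W orthogonal, solved in closed form
# from the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# For closely related languages this residual tends to be small; the thesis
# argues it can stay large for distant pairs (e.g. English-Chinese),
# motivating non-linear maps such as kernel CCA in the second study.
print("alignment residual:", np.linalg.norm(X @ W - Y) / np.linalg.norm(Y))
```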
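The fourth study in the same thesis distils a machine-translation teacher into an end-to-end speech-translation student. A generic token-level distillation loss of the kind such work builds on might look like the following; the tensor shapes, temperature, and random stand-in logits are assumptions, not the thesis's actual training recipe.

```python
# Token-level knowledge distillation: an MT teacher's softened output
# distribution supervises an ST student over the same target sentence.
import torch
import torch.nn.functional as F

vocab, batch, steps = 1000, 4, 20
temperature = 2.0  # hypothetical softening temperature

# Stand-ins for per-token logits: teacher reads transcripts, student reads audio.
teacher_logits = torch.randn(batch, steps, vocab)
student_logits = torch.randn(batch, steps, vocab, requires_grad=True)

# KL divergence between softened distributions, scaled by T^2 as usual.
kd_loss = F.kl_div(
    F.log_softmax(student_logits / temperature, dim=-1),
    F.softmax(teacher_logits / temperature, dim=-1),
    reduction="batchmean",
) * temperature ** 2

kd_loss.backward()  # gradients flow only into the student
print(float(kd_loss))
```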
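Finally, the entity-analysis abstract describes FGS2EE as generating a semantic entity embedding from semantic type words and combining it with the existing entity embedding through linear aggregation. A minimal reading of that sentence, with a hypothetical mixing weight alpha and random stand-in vectors, could look like this:

```python
# FGS2EE-style linear aggregation, sketched under illustrative assumptions:
# the type words, vectors, and alpha below are stand-ins, not the paper's data.
import numpy as np

rng = np.random.default_rng(1)
dim = 300

# Stand-ins for pretrained word embeddings of semantic type words and for an
# existing entity embedding.
word_emb = {w: rng.normal(size=dim) for w in ["politician", "lawyer", "author"]}
entity_emb = rng.normal(size=dim)  # existing embedding for one entity

def fgs2ee(entity_vec, type_words, alpha=0.5):
    """Linearly aggregate an entity embedding with the mean embedding of its
    fine-grained semantic type words; alpha is a hypothetical mixing weight."""
    semantic_vec = np.mean([word_emb[w] for w in type_words], axis=0)
    combined = alpha * entity_vec + (1.0 - alpha) * semantic_vec
    return combined / np.linalg.norm(combined)  # renormalise for cosine use

enriched = fgs2ee(entity_emb, ["politician", "lawyer"])
print(enriched.shape)  # (300,)
```

Pulling entities with shared type words toward a common direction is what reduces the distinctiveness of the original embeddings, which is the stated goal of the method.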