Deep learning for low-resource machine translation : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science, School of Mathematical and Computational Sciences, Massey University, Albany, Auckland, New Zealand. EMBARGOED until further notice.
| dc.confidential | Embargo: Yes | |
| dc.contributor.advisor | Wang, Ruili | |
| dc.contributor.author | Gao, Yuan | |
| dc.date.accessioned | 2025-09-01T04:35:14Z | |
| dc.date.available | 2025-09-01T04:35:14Z | |
| dc.date.issued | 2025-09-01 | |
| dc.description | Embargoed until further notice | |
| dc.description.abstract | Machine translation, a key task in natural language processing, aims to automatically translate text from one language to another while preserving semantic integrity. This thesis builds upon existing research and introduces three deep-learning methods to enhance translation performance under low-resource conditions: (i) an effective transfer learning framework that leverages knowledge from high-resource language pairs, (ii) a pre-ordering-aware training method that explicitly utilizes contextualized representations of pre-ordered sentences, and (iii) a data augmentation strategy that expands the training data. Firstly, we develop a two-step fine-tuning (TSFT) transfer learning framework for low-resource machine translation. Due to the inherent linguistic divergence between the languages of the parent (high-resource) and child (low-resource) translation tasks, the parent model often provides a suboptimal initialization for directly fine-tuning the child model. Our TSFT framework addresses this limitation by incorporating a pre-fine-tuning stage that adapts the parent model to the characteristics of the child source language, improving child model initialization and overall translation quality. Secondly, we propose a training method that enables the model to learn pre-ordering knowledge and encode word-reordering information within the contextualized representations of source sentences. Pre-ordering refers to rearranging source-side words before translation to better align with the target-side word order, which helps mitigate word-order differences between languages. Existing methods typically integrate the information of pre-ordered source sentences at the token level, assigning each token a local representation that fails to capture broader contextual dependencies. Moreover, these methods still require pre-ordered sentences during inference, incurring additional inference costs. In contrast, our method encodes the pre-ordering information in the contextualized representations of source sentences and eliminates the need to pre-order sentences at inference time while preserving the benefits of pre-ordering for translation quality. Thirdly, to address data scarcity in low-resource scenarios, we propose a data augmentation strategy that employs high-quality translation models trained bidirectionally on high-resource language pairs. This strategy generates diverse, high-fidelity pseudo-training data through systematic sentence rephrasing, producing multiple target translations for each source sentence. The increased diversity on the target side enhances the model's robustness, as demonstrated by significant performance improvements across eight low-resource language pairs. Finally, we conduct an empirical study to explore the potential of applying ChatGPT to machine translation. We design a set of translation prompts incorporating various auxiliary information to assist ChatGPT in generating translations. Our findings indicate that, with carefully designed prompts, ChatGPT can achieve results comparable to those of commercial translation systems for high-resource languages. Moreover, this study establishes a foundation for future research, offering insights into prompt engineering strategies for leveraging large language models in machine translation tasks. | |
| dc.identifier.uri | https://mro.massey.ac.nz/handle/10179/73455 | |
| dc.publisher | Massey University | |
| dc.rights | © The Author | |
| dc.subject | deep learning, machine translation, low-resource languages | |
| dc.subject | Machine translating | |
| dc.subject | Computer programs | |
| dc.subject | Computational linguistics | |
| dc.subject | Artificial intelligence | |
| dc.subject | Data processing | |
| dc.subject | Generative artificial intelligence | |
| dc.subject.anzsrc | 461103 Deep learning | |
| dc.subject.anzsrc | 470403 Computational linguistics | |
| dc.title | Deep learning for low-resource machine translation : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science, School of Mathematical and Computational Sciences, Massey University, Albany, Auckland, New Zealand. EMBARGOED until further notice. | |
| thesis.degree.discipline | Computer Science (Machine Translation / Natural Language Processing) | |
| thesis.degree.name | Doctor of Philosophy (Ph.D.) | |
| thesis.description.doctoral-citation-abridged | Yuan Gao’s doctoral research advances machine translation for low-resource languages, which often lack the data needed for effective systems. His work introduces new deep-learning approaches and explores the potential of large language models like ChatGPT. This research contributes to bridging linguistic gaps and making translation technology more accessible across diverse languages worldwide. | |
| thesis.description.doctoral-citation-long | Yuan Gao’s doctoral research focuses on advancing machine translation, with a particular emphasis on low-resource languages that lack sufficient data for effective systems. His work develops new deep-learning approaches to improve translation quality and explores the potential of large language models such as ChatGPT for this task. By addressing challenges of data scarcity and linguistic diversity, his research contributes to making translation technology more inclusive and accessible, helping people from different language backgrounds communicate more effectively. | |
| thesis.description.name-pronounciation | Yoo-ahn Gow |
