Development of Natural Language Processing Tools for Cook Islands Māori
This paper presents three ongoing projects for NLP in Cook Islands Māori: Untrained Forced Alignment (approx. 9% error when detecting the center of words), automatic speech recognition (37% WER in the best trained models) and automatic part-of-speech tagging (92% accuracy for the best performing model). These new resources fill existing gaps in NLP for the language, including gold standard POS-tagged written corpora, transcribed speech corpora, and time-aligned corpora down to the phoneme level. These are part of efforts to accelerate the documentation of Cook Islands Māori and to increase its vitality amongst its users.
Proceedings of Australasian Language Technology Association Workshop, 2018, pp. 26-33