Development of Automatic Speech Recognition for the Documentation of Cook Islands Māori

dc.citation.issue: 13 [en_US]
dc.contributor.author: Coto-Solano, R [en_US]
dc.contributor.author: Nicholas, SA [en_US]
dc.contributor.author: Datta, S [en_US]
dc.contributor.author: Quint, V [en_US]
dc.contributor.author: Wills, P [en_US]
dc.contributor.author: Powell, EN [en_US]
dc.contributor.author: Koka‘ua, L [en_US]
dc.contributor.author: Tanveer, S [en_US]
dc.contributor.author: Feldman, I [en_US]
dc.coverage.spatial: Marseille [en_US]
dc.date.accessioned: 2022-06-30T19:33:07Z
dc.date.available: 2022-06-20 [en_US]
dc.date.available: 2022-06-30T19:33:07Z
dc.date.finish-date: 2022-06-25 [en_US]
dc.date.issued: 2022-06-20 [en_US]
dc.date.start-date: 2022-06-21 [en_US]
dc.description.abstract: This paper describes the data processing and training of an automatic speech recognition (ASR) system for Cook Islands Māori (CIM), an Indigenous language spoken by approximately 22,000 people in the South Pacific. We transcribed four hours of speech from adults and elderly speakers of the language and prepared two experiments. First, we trained three ASR systems: one statistical, Kaldi; and two based on Deep Learning, DeepSpeech and XLSR-Wav2Vec2. Wav2Vec2 tied with Kaldi for lowest character error rate (CER=6±1) and was slightly behind in word error rate (WER=23±2 versus WER=18±2 for Kaldi). This provides evidence that Deep Learning ASR systems are reaching the performance of statistical methods on small datasets, and that they can work effectively with extremely low-resource Indigenous languages like CIM. In the second experiment we used Wav2Vec2 to train models with held-out speakers. While the performance decreased (CER=15±7, WER=46±16), the system still showed considerable learning. We intend to use ASR to accelerate the documentation of CIM, using newly transcribed texts to improve the ASR and also generate teaching and language revitalization materials. The trained model is available under a license based on the Kaitiakitanga License, which provides for non-commercial use while retaining control of the model by the Indigenous community. [en_US]
dc.description.confidential: false [en_US]
dc.description.place-of-publication: Marseille, France [en_US]
dc.format.extent: 3872 - 3882 (11) [en_US]
dc.identifier: https://aclanthology.org/2022.lrec-1.412 [en_US]
dc.identifier.citation: Proceedings of the Language Resources and Evaluation Conference, 2022, (13), pp. 3872 - 3882 (11) [en_US]
dc.identifier.elements-id: 453927
dc.identifier.harvested: Massey_Dark
dc.identifier.uri: http://hdl.handle.net/10179/17252
dc.publisher: European Language Resources Association [en_US]
dc.publisher.uri: https://aclanthology.org/2022.lrec-1.412 [en_US]
dc.relation.isPartOf: Proceedings of the Language Resources and Evaluation Conference [en_US]
dc.relation.uri: http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.412.pdf [en_US]
dc.rights: CC-BY-NC-4.0 [en_US]
dc.source: Language Resources and Evaluation (LREC) Conference 2022 [en_US]
dc.title: Development of Automatic Speech Recognition for the Documentation of Cook Islands Māori [en_US]
dc.type: Conference Paper
pubs.notes: Not known [en_US]
pubs.organisational-group: /Massey University
pubs.organisational-group: /Massey University/College of Humanities and Social Sciences
pubs.organisational-group: /Massey University/College of Humanities and Social Sciences/School of Humanities, Media & Creative Communication
Files
Original bundle
Name: 2022.lrec-1.412 copy.pdf
Size: 371.3 KB
Format: Adobe Portable Document Format