Lexicon-based fine-tuning of multilingual language models for low-resource language sentiment analysis
dc.citation.volume | Early View | |
dc.contributor.author | Dhananjaya V | |
dc.contributor.author | Ranathunga S | |
dc.contributor.author | Jayasena S | |
dc.date.accessioned | 2024-10-23T02:03:33Z | |
dc.date.available | 2024-10-23T02:03:33Z | |
dc.date.issued | 2024-04-01 | |
dc.description.abstract | Pre-trained multilingual language models (PMLMs) such as mBERT and XLM-R have shown good cross-lingual transferability. However, they are not specifically trained to capture cross-lingual signals concerning sentiment words. This poses a disadvantage for low-resource languages (LRLs) that are under-represented in these models. To better fine-tune these models for sentiment classification in LRLs, a novel intermediate task fine-tuning (ITFT) technique based on a sentiment lexicon of a high-resource language (HRL) is introduced. The authors experiment with the LRLs Sinhala, Tamil and Bengali on a 3-class sentiment classification task and show that this method outperforms vanilla fine-tuning of the PMLM. It also outperforms or is on par with basic ITFT that relies on an HRL sentiment classification dataset. | |
dc.description.confidential | false | |
dc.edition.edition | 2024 | |
dc.identifier.citation | Dhananjaya V, Ranathunga S, Jayasena S. (2024). Lexicon-based fine-tuning of multilingual language models for low-resource language sentiment analysis. CAAI Transactions on Intelligence Technology. Early View. | |
dc.identifier.doi | 10.1049/cit2.12333 | |
dc.identifier.eissn | 2468-2322 | |
dc.identifier.elements-type | journal-article | |
dc.identifier.issn | 2468-6557 | |
dc.identifier.uri | https://mro.massey.ac.nz/handle/10179/71828 | |
dc.language | English | |
dc.publisher | John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology and Chongqing University of Technology. | |
dc.publisher.uri | https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/cit2.12333 | |
dc.relation.isPartOf | CAAI Transactions on Intelligence Technology | |
dc.rights | (c) 2024 The Author/s | |
dc.rights | CC BY 4.0 | |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
dc.subject | deep learning | |
dc.subject | natural languages | |
dc.subject | natural language processing | |
dc.title | Lexicon-based fine-tuning of multilingual language models for low-resource language sentiment analysis | |
dc.type | Journal article | |
pubs.elements-id | 488633 | |
pubs.organisational-group | Other |