Lexicon-based fine-tuning of multilingual language models for low-resource language sentiment analysis
Date
2024-04-01
Authors
Dhananjaya V; Ranathunga S; Jayasena S
Journal Title
CAAI Transactions on Intelligence Technology
Publisher
John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology and Chongqing University of Technology.
Rights
(c) 2024 The Author(s)
CC BY 4.0
Abstract
Pre-trained multilingual language models (PMLMs) such as mBERT and XLM-R have shown good cross-lingual transferability. However, they are not specifically trained to capture cross-lingual signals concerning sentiment words. This puts low-resource languages (LRLs), which are under-represented in these models, at a disadvantage. To better fine-tune these models for sentiment classification in LRLs, a novel intermediate task fine-tuning (ITFT) technique based on a sentiment lexicon of a high-resource language (HRL) is introduced. The authors experiment with the LRLs Sinhala, Tamil, and Bengali on a 3-class sentiment classification task and show that this method outperforms vanilla fine-tuning of the PMLM. It also outperforms or is on par with basic ITFT that relies on an HRL sentiment classification dataset.
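
The record does not spell out the ITFT procedure, so the following is only a rough sketch of the two-stage idea the abstract describes, assuming the intermediate task trains the PMLM to classify individual HRL lexicon entries by polarity before fine-tuning on the LRL sentiment dataset. The lexicon entries, label scheme, model choice, and hyperparameters below are illustrative assumptions, not the authors' actual setup.

```python
# Sketch of lexicon-based intermediate task fine-tuning (ITFT), assumed here
# to be: (1) fine-tune a PMLM on HRL sentiment-lexicon words, then
# (2) fine-tune the same weights on the LRL sentence-level dataset.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import Dataset

MODEL = "xlm-roberta-base"  # one PMLM the abstract mentions (XLM-R)

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=3)

def tokenize(batch):
    # Pad to a fixed length so the default data collator can batch examples.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=32)

# Stage 1: intermediate task on HRL lexicon entries. The three entries here
# are hypothetical placeholders; a real sentiment lexicon supplies thousands
# of (word, polarity) pairs. Labels: 0=negative, 1=neutral, 2=positive.
lexicon = Dataset.from_dict({
    "text": ["excellent", "terrible", "table"],
    "label": [2, 0, 1],
}).map(tokenize, batched=True)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="itft-stage1", num_train_epochs=3),
    train_dataset=lexicon,
).train()

# Stage 2: fine-tune the same model on the target LRL sentiment dataset
# (e.g. Sinhala, Tamil, or Bengali sentences with 3-class labels).
lrl_data = Dataset.from_dict({
    "text": ["...an LRL sentence..."],
    "label": [2],
}).map(tokenize, batched=True)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="itft-final", num_train_epochs=3),
    train_dataset=lrl_data,
).train()
```

If this reading is right, the appeal of the method is that stage 1 needs only an HRL sentiment lexicon, which is typically cheaper to obtain than the HRL sentence-level sentiment dataset that basic ITFT requires.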
Keywords
deep learning, natural languages, natural language processing
Citation
Dhananjaya V, Ranathunga S, Jayasena S. (2024). Lexicon-based fine-tuning of multilingual language models for low-resource language sentiment analysis. CAAI Transactions on Intelligence Technology. Early View.
Creative Commons license
Except where otherwise noted, this item's license is described as (c) 2024 The Author(s)

