Use of prompt-based learning for code-mixed and code-switched text classification

dc.citation.issue: 5
dc.citation.volume: 27
dc.contributor.author: Udawatta P
dc.contributor.author: Udayangana I
dc.contributor.author: Gamage C
dc.contributor.author: Shekhar R
dc.contributor.author: Ranathunga S
dc.date.accessioned: 2024-10-09T00:40:20Z
dc.date.available: 2024-10-09T00:40:20Z
dc.date.issued: 2024-09-09
dc.description.abstract: Code-mixing and code-switching (CMCS) are prevalent phenomena observed in social media conversations and various other modes of communication. When developing applications such as sentiment analysers and hate-speech detectors that operate on this social media data, CMCS text poses challenges. Recent studies have demonstrated that prompt-based learning of pre-trained language models outperforms full fine-tuning across various tasks. Despite the growing interest in classifying CMCS text, the effectiveness of prompt-based learning for the task remains unexplored. This paper presents an extensive exploration of prompt-based learning for CMCS text classification and the first comprehensive analysis of the impact of the script on classifying CMCS text. Our study reveals that the performance in classifying CMCS text is significantly influenced by the inclusion of multiple scripts and the intensity of code-mixing. In response, we introduce a novel method, Dynamic+AdapterPrompt, which employs distinct models for each script, integrated with adapters. While DynamicPrompt captures the script-specific representation of the text, AdapterPrompt emphasizes capturing the task-oriented functionality. Our experiments on Sinhala-English, Kannada-English, and Hindi-English datasets for sentiment classification, hate-speech detection, and humour detection tasks show that our method outperforms strong fine-tuning baselines and basic prompting strategies.
dc.description.confidential: false
dc.identifier.citation: Udawatta P, Udayangana I, Gamage C, Shekhar R, Ranathunga S. (2024). Use of prompt-based learning for code-mixed and code-switched text classification. World Wide Web. 27. 5.
dc.identifier.doi: 10.1007/s11280-024-01302-2
dc.identifier.eissn: 1573-1413
dc.identifier.elements-type: journal-article
dc.identifier.issn: 1386-145X
dc.identifier.number: 63
dc.identifier.uri: https://mro.massey.ac.nz/handle/10179/71646
dc.language: English
dc.publisher: Springer Nature
dc.publisher.uri: https://link.springer.com/article/10.1007/s11280-024-01302-2
dc.relation.isPartOf: World Wide Web
dc.rights: (c) 2024 The Author/s
dc.rights: CC BY 4.0
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Code-mixing
dc.subject: Code-switching
dc.subject: Prompt-based learning
dc.subject: Pre-trained language models
dc.subject: XLM-R
dc.subject: Text classification
dc.subject: Language script
dc.subject: Sinhala
dc.subject: Kannada
dc.subject: Hindi
dc.title: Use of prompt-based learning for code-mixed and code-switched text classification
dc.type: Journal article
pubs.elements-id: 491593
pubs.organisational-group: Other
Files
Original bundle: Published version.pdf (Adobe Portable Document Format, 1.91 MB; description: 491593 PDF.pdf)
License bundle: license.txt (Plain Text, 9.22 KB)