Computational Modeling of Agglutinative Languages: The Challenge for Southern Bantu Languages

Date
2021-02-24
Publisher
Arusha Linguistics
Rights
The Author(s) CC BY 4.0
Abstract
In computational linguistics, language models are probabilistic models that predict the likelihood of words occurring within specific sentences. They are key components of many natural language processing systems. Traditional full-word models do not work well for agglutinative languages, in which words are built out of distinctly identifiable sub-parts that carry specific meanings and functions and can be combined in different ways to form new words. Sub-word language models have been proposed to address this problem and have had success with some agglutinative languages. However, the existing models do not appear to address the specific ways in which sentences and words are formed in the Southern Bantu languages, which are agglutinative. The adoption of sub-word models for these languages has also been low.
Citation
Arusha Working Papers in African Linguistics, 2021, 3 (1), pp. 52 - 81 (30)