Ensembles of neural networks for language modeling : a thesis presented in partial fulfilment of the requirements for the degree of Master of Philosophy in Information Technology at Massey University, Auckland, New Zealand

Date
2018
Publisher
Massey University
Rights
The Author
Abstract
Language modeling is widely used in natural language processing applications and has therefore gained a significant following in recent years. The objective of language modeling is to estimate the probability distribution over linguistic units, e.g., characters, words, phrases, and sentences, using either traditional statistical methods or modern machine learning approaches. In this thesis, we first systematically study language models, covering both traditional discrete-space models and the latest continuous-space neural network models. We then focus on modern continuous-space language models, which embed the elements of a language into a continuous space with the aim of finding a suitable word representation for a given dataset. By mapping the vocabulary into a continuous space, a deep learning model can predict the probability of future words from the word history more efficiently than traditional models can. However, such models still suffer from various drawbacks, so we study a series of variants of the latest neural network architectures and propose a modified recurrent neural network for language modeling. Experimental results show that our modified model achieves performance competitive with existing state-of-the-art models while significantly reducing training time.
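For concreteness, a language model factorizes the joint probability of a word sequence as P(w_1, ..., w_T) = prod_{t=1}^{T} P(w_t | w_1, ..., w_{t-1}), and a continuous-space model conditions each factor on an embedded word history. The sketch below, written against PyTorch, illustrates the general pattern the abstract describes: an embedding layer maps discrete word indices into a continuous space, a recurrent layer summarizes the history, and a linear decoder produces the next-word distribution. It is a minimal generic LSTM language model with arbitrary hyperparameters, not the modified architecture proposed in the thesis.

    # Minimal sketch of a continuous-space recurrent language model (PyTorch).
    # Illustrative only: hyperparameters are arbitrary and this is a generic
    # LSTM language model, not the thesis's modified architecture.
    import torch
    import torch.nn as nn

    class RNNLanguageModel(nn.Module):
        def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
            super().__init__()
            # Map discrete word indices into a continuous embedding space.
            self.embedding = nn.Embedding(vocab_size, embed_dim)
            # Recurrent layer summarizes the word history into a hidden state.
            self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            # Project the hidden state back onto the vocabulary.
            self.decoder = nn.Linear(hidden_dim, vocab_size)

        def forward(self, tokens):
            # tokens: (batch, seq_len) integer word indices
            embedded = self.embedding(tokens)
            output, _ = self.rnn(embedded)
            return self.decoder(output)  # next-word logits at each step

    # Training minimizes cross-entropy between predicted and actual next
    # words, which is equivalent to minimizing perplexity on the corpus.
    model = RNNLanguageModel(vocab_size=10_000)
    tokens = torch.randint(0, 10_000, (2, 5))    # toy batch of word indices
    logits = model(tokens)                       # shape (2, 5, 10000)
    loss = nn.functional.cross_entropy(
        logits[:, :-1].reshape(-1, 10_000),      # predictions for steps 1..4
        tokens[:, 1:].reshape(-1),               # the actual next words
    )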
Keywords
Linguistic models, Data processing, Neural networks (Computer science), Natural language processing (Computer science)