Deep learning for entity analysis : a thesis submitted in partial fulfilment for the degree of Doctor of Philosophy in Computer Science at the School of Natural and Computational Sciences, Massey University, Albany, New Zealand

dc.confidentialEmbargo: No
dc.contributor.advisor: Wang, Ruili
dc.contributor.author: Hou, Feng
dc.date.accessioned: 2021-03-01T20:22:56Z
dc.date.available: 2021-03-01T20:22:56Z
dc.date.issued: 2021
dc.description.abstract: Our research focuses on three sub-tasks of entity analysis: fine-grained entity typing (FGET), entity linking and entity coreference resolution. We aim to improve FGET and entity linking by exploiting document-level type constraints, and to improve entity linking and coreference resolution by embedding fine-grained entity type information. To extract more efficient feature representations and offset label noise in FGET datasets, we propose three transfer learning schemes: (i) transferring sub-word embeddings to generate more efficient out-of-vocabulary (OOV) embeddings for mentions; (ii) using a pre-trained language model to generate more efficient context features; and (iii) using a pre-trained topic model to transfer topic-type relatedness through topic anchors and to select among confusing fine-grained types at inference time. The pre-trained topic model can offset label noise without retreating to coarse-grained types. To reduce the distinctiveness of existing entity embeddings and facilitate the learning of contextual commonality for entity linking, we propose a simple yet effective method, FGS2EE, to inject fine-grained semantic information into entity embeddings. FGS2EE first uses the embeddings of semantic type words to generate semantic entity embeddings, and then combines them with existing entity embeddings through linear aggregation. Based on our entity embeddings, we have achieved new state-of-the-art performance on two of the five out-domain test sets for entity linking. Further, we propose DOC-AET, a method that exploits DOCument-level coherence of named entity mentions and Anonymous Entity Type (AET) words/mentions. We learn embeddings of AET words from the AET words' inter-paragraph co-occurrence matrix, then build AET entity embeddings and document AET context embeddings from those word embeddings. AET coherence is computed from the AET entity embeddings and document context embeddings. By incorporating such coherence scores, DOC-AET achieves new state-of-the-art results on three of the five out-domain test sets for entity linking. We also propose LASE (Less Anisotropic Span Embeddings) schemes for coreference resolution. We investigate the effectiveness of these schemes through extensive experiments, and our ablation studies provide valuable insights into contextualized representations. In summary, this thesis proposes four deep learning approaches for entity analysis. Extensive experiments show that we achieve state-of-the-art performance on the three sub-tasks of entity analysis.
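The FGS2EE step described in the abstract — averaging semantic type-word embeddings into a semantic entity embedding, then linearly aggregating it with the existing entity embedding — can be sketched roughly as follows. The mixing weight `alpha`, the 4-dimensional toy vectors, and the function names are illustrative assumptions, not details taken from the thesis.

```python
def semantic_entity_embedding(type_word_vecs):
    """Average the embeddings of an entity's semantic type words
    (e.g. 'scientist', 'university') into one semantic vector."""
    dim = len(type_word_vecs[0])
    n = len(type_word_vecs)
    return [sum(v[i] for v in type_word_vecs) / n for i in range(dim)]

def fgs2ee_aggregate(entity_vec, semantic_vec, alpha=0.5):
    """Linear aggregation of an existing entity embedding with its
    semantic entity embedding; alpha is a hypothetical mixing weight."""
    return [(1.0 - alpha) * e + alpha * s
            for e, s in zip(entity_vec, semantic_vec)]

# toy usage with 4-dimensional embeddings
sem = semantic_entity_embedding([[1.0, 0.0, 0.0, 0.0],
                                 [0.0, 1.0, 0.0, 0.0]])   # [0.5, 0.5, 0.0, 0.0]
ent = [0.0, 0.0, 1.0, 1.0]
combined = fgs2ee_aggregate(ent, sem, alpha=0.5)          # [0.25, 0.25, 0.5, 0.5]
```

The aggregation keeps both vectors in the same space, so the combined embedding can still be compared against context representations with the scorer the linker already uses.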
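For DOC-AET, the abstract states only that a coherence score is computed from an AET entity embedding and a document AET context embedding. One plausible reading is a similarity score between the two vectors; the sketch below uses cosine similarity, which is an assumption on our part rather than the thesis's stated formula.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def doc_aet_coherence(aet_entity_vec, doc_context_vec):
    """Score how well a candidate entity's AET embedding coheres with
    the document's AET context embedding (cosine is an assumed choice)."""
    return cosine(aet_entity_vec, doc_context_vec)

# parallel vectors -> coherence close to 1.0
score = doc_aet_coherence([1.0, 2.0, 0.0], [2.0, 4.0, 0.0])
```

Such a score would then be added as one feature among the linker's existing mention-context signals when ranking candidate entities.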
dc.identifier.uri: http://hdl.handle.net/10179/16462
dc.publisher: Massey University
dc.rights: The Author
dc.subject: Natural language processing (Computer science)
dc.subject: Entity-relationship modeling
dc.subject: Machine learning
dc.subject.anzsrc: 460208 Natural language processing
dc.subject.anzsrc: 461103 Deep learning
dc.title: Deep learning for entity analysis : a thesis submitted in partial fulfilment for the degree of Doctor of Philosophy in Computer Science at the School of Natural and Computational Sciences, Massey University, Albany, New Zealand
dc.type: Thesis
massey.contributor.author: Hou, Feng
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Massey University
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy
Files
Original bundle
Name: HouPhDThesis.pdf
Size: 4.97 MB
Format: Adobe Portable Document Format