An analysis of the missing data methodology for different types of data : a thesis presented in partial fulfilment of the requirements for the degree of Master of Applied Statistics at Massey University

Scheffer, Judith-Anne

dc.contributor.author	Scheffer, Judith-Anne
dc.date.accessioned	2016-05-23T02:22:50Z
dc.date.available	2016-05-23T02:22:50Z
dc.date.issued	2000
dc.description.abstract	Missing data is an eternal problem in data analysis. It is widely recognised that data is costly to collect, yet the methods used to deal with missing data in the past relied on case deletion. There is no single best fix, but rather many different methodologies suited to different situations. This study was motivated by the writer's time spent analysing data in a nutrition study, and by the realisation of how much data was wasted by case deletion and how this could bias inferences drawn from the results. A better method (or methods) of dealing with missing data than case deletion is required, to ensure valuable information is not lost. The study first reviews what is being done and what is in the literature, which has exploded with new methods in recent times. Algorithms based on these methods have been written and incorporated into a number of statistical packages and add-on libraries. Statistical packages are also reviewed for their practicality and application in this area. Different methodologies and software packages are then applied to the nutrition data to assess different types of imputation. A set of questions is posed, based on the type of data, the type of missingness, the extent of missingness, the required end use of the data, the size of the dataset, and how extensive the analysis needs to be. These questions can guide the investigator towards an appropriate form of imputation for the type of data at hand. A comparison of imputation methods and results is given, with the principal result that imputing missing data is a very worthwhile exercise to reduce bias in survey results, and one that can be carried out by any researcher analysing their own data. Further to this, a conjecture is given for using Data Augmentation for ordinal data, particularly Likert scales. Previously, imputation for such data has been restricted to either person or item mean imputation, or hot deck methods. For other types of data, model-based imputation is far superior. 
Model-based methods for Likert data are obtained by inserting the linear-by-linear association model into standard missing data methodology.	en_US
dc.identifier.uri	http://hdl.handle.net/10179/7862
dc.identifier.wikidata	Q112902770
dc.identifier.wikidata-uri	https://www.wikidata.org/wiki/Q112902770
dc.language.iso	en	en_US
dc.publisher	Massey University	en_US
dc.rights	The Author	en_US
dc.subject	Mathematical statistics	en_US
dc.subject	Missing observations (Statistics)	en_US
dc.subject	Methodology	en_US
dc.title	An analysis of the missing data methodology for different types of data : a thesis presented in partial fulfilment of the requirements for the degree of Master of Applied Statistics at Massey University	en_US
dc.type	Thesis	en_US
massey.contributor.author	Scheffer, Judith-Anne	en_US
thesis.degree.discipline	Applied Statistics	en_US
thesis.degree.grantor	Massey University	en_US
thesis.degree.level	Masters	en_US
thesis.degree.name	Master of Applied Statistics (M. Appl. Stat.)	en_US

Files

Original bundle

Name: 01_front.pdf
Size: 990.01 KB
Format: Adobe Portable Document Format

Name: 02_whole.pdf
Size: 24.83 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 804 B
Format: Item-specific license agreed upon to submission

Collections

Theses and Dissertations