An analysis of the missing data methodology for different types of data : a thesis presented in partial fulfilment of the requirements for the degree of Master of Applied Statistics at Massey University

dc.contributor.author: Scheffer, Judith-Anne
dc.date.accessioned: 2016-05-23T02:22:50Z
dc.date.available: 2016-05-23T02:22:50Z
dc.date.issued: 2000
dc.description.abstract: Missing data is an eternal problem in data analysis. It is widely recognised that data is costly to collect, yet the methods used in the past to deal with missing data relied on case deletion. There is no single best fix, but rather many different methodologies suited to different situations. This study was motivated by the writer's time spent analysing data in the nutrition study, and by the realisation of how much data was wasted by case deletion and how this could bias inferences drawn from the results. A better method (or methods) of dealing with missing data than case deletion is required, to ensure that valuable information is not lost. What is being done, and what is in the literature? The literature on this topic has exploded with new methods in recent times, and algorithms based on these methods have been written and incorporated into a number of statistical packages and add-on libraries. Statistical packages are also reviewed for their practicality and application in this area. Different methodologies and software packages are then applied to the nutrition data to assess different types of imputation. A set of questions is posed, based on the type of data, the type of missingness, the extent of missingness, the required end use of the data, the size of the dataset, and how extensive the analysis needs to be. These questions can guide the investigator towards an appropriate form of imputation for the type of data at hand. A comparison of imputation methods and results is given, with the principal result that imputing missing data is a very worthwhile exercise to reduce bias in survey results, and one that can be achieved by any researcher analysing their own data. Further to this, a conjecture is given for using Data Augmentation for ordinal data, particularly Likert scales. Previously, imputation for such data has been restricted to either person or item mean imputation, or hot deck methods. Model based methods for imputation are far superior for other types of data. Model based methods for Likert data are achieved by inserting the linear by linear association model into standard missing data methodology.
dc.identifier.uri: http://hdl.handle.net/10179/7862
dc.language.iso: en
dc.publisher: Massey University
dc.rights: The Author
dc.subject: Mathematical statistics
dc.subject: Missing observations (Statistics)
dc.subject: Methodology
dc.title: An analysis of the missing data methodology for different types of data : a thesis presented in partial fulfilment of the requirements for the degree of Master of Applied Statistics at Massey University
dc.type: Thesis
massey.contributor.author: Scheffer, Judith-Anne
thesis.degree.discipline: Applied Statistics
thesis.degree.grantor: Massey University
thesis.degree.level: Masters
thesis.degree.name: Master of Applied Statistics (M. Appl. Stat.)
Files
Original bundle
Name:
01_front.pdf
Size:
990.01 KB
Format:
Adobe Portable Document Format
Name:
02_whole.pdf
Size:
24.83 MB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
804 B
Format:
Item-specific license agreed upon to submission