A Monte Carlo study of the effect of sample bias on the multinomial logit coefficients : a thesis presented in partial fulfilment of the requirements of the degree of the Master of Business Studies at Massey University
This thesis reports the findings of a Monte Carlo simulation of the effect of sample bias on the parameters of the multinomial logit (MNL) choice model. At issue is whether parameter estimates obtained from biased samples can be generalised to the rest of the population. An actual data set of 164 respondents was used to estimate an aggregate model. Taking these parameters as the true coefficients of choice behaviour, an unbiased sampling distribution of the MNL parameters was derived by repeatedly fitting aggregate models to artificially generated sets of individual responses. The biased sampling distribution was then derived by selectively eliminating individuals at the tails of the sample distribution, based on their correlation with one of the independent variables. The expected values of the biased and unbiased sampling distributions were compared to assess the sensitivity of the model to sample bias. The research found that the biased coefficients changed by significantly more than the proportion of individuals removed. However, this sensitivity was predictable, as the percentage change in the value of a coefficient was related to the size of that coefficient. It was also found that the coefficients of the unbiased variables were not significantly influenced by bias on another variable, and the ratio between the coefficients of the unbiased variables was maintained. It was concluded that, although sensitive to bias, the estimates produced by the MNL model could be adjusted to reflect the differing effect of the bias on the coefficients. Additionally, there was no evidence that MNL estimates calibrated on unbiased samples failed to reflect the effects of interest.
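The procedure described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the "true" coefficients, the two attributes, the number of choice tasks per respondent, the 10% trimming rate, and all function names are hypothetical and chosen only for illustration. The sketch also makes one simplifying assumption, trimming only the upper tail of each respondent's correlation between the choice indicator and the first attribute. It generates synthetic respondents from an assumed MNL model, fits an aggregate model to the full and the trimmed samples, and compares the mean coefficients across replications.

```python
import math
import random

TRUE_BETA = (1.0, -0.5)  # hypothetical "true" coefficients, not the thesis values

def probs(beta, alts):
    """MNL choice probabilities for one choice set (list of attribute tuples)."""
    utils = [sum(b * x for b, x in zip(beta, a)) for a in alts]
    top = max(utils)  # subtract the maximum for numerical stability
    expu = [math.exp(u - top) for u in utils]
    total = sum(expu)
    return [e / total for e in expu]

def simulate_person(rng, n_tasks=8, n_alts=3):
    """One synthetic respondent: a list of (alternatives, chosen index) pairs."""
    tasks = []
    for _ in range(n_tasks):
        alts = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_alts)]
        chosen = rng.choices(range(n_alts), weights=probs(TRUE_BETA, alts))[0]
        tasks.append((alts, chosen))
    return tasks

def fit_mnl(people, iters=8):
    """Aggregate two-parameter MNL fitted by Newton-Raphson on the log-likelihood."""
    b = [0.0, 0.0]
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for tasks in people:
            for alts, chosen in tasks:
                p = probs(b, alts)
                m0 = sum(pi * a[0] for pi, a in zip(p, alts))
                m1 = sum(pi * a[1] for pi, a in zip(p, alts))
                g0 += alts[chosen][0] - m0  # score (gradient) contributions
                g1 += alts[chosen][1] - m1
                for pi, a in zip(p, alts):  # information matrix contributions
                    d0, d1 = a[0] - m0, a[1] - m1
                    h00 += pi * d0 * d0
                    h01 += pi * d0 * d1
                    h11 += pi * d1 * d1
        det = h00 * h11 - h01 * h01
        b[0] += (h11 * g0 - h01 * g1) / det
        b[1] += (h00 * g1 - h01 * g0) / det
    return b

def choice_correlation(tasks, k=0):
    """Pearson correlation between the 0/1 choice indicator and attribute k."""
    xs = [a[k] for alts, _ in tasks for a in alts]
    ys = [1.0 if i == chosen else 0.0
          for alts, chosen in tasks for i in range(len(alts))]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def run_study(replications=10, n_people=164, trim=0.10, seed=0):
    """Mean coefficients from full samples and from upper-tail-trimmed samples."""
    rng = random.Random(seed)
    full, trimmed = [], []
    for _ in range(replications):
        people = [simulate_person(rng) for _ in range(n_people)]
        full.append(fit_mnl(people))
        # Assumption: drop the respondents whose choices track attribute 0 most.
        ranked = sorted(people, key=choice_correlation)
        kept = ranked[: int(round(n_people * (1 - trim)))]
        trimmed.append(fit_mnl(kept))

    def mean(rows):
        return [sum(r[k] for r in rows) / len(rows) for k in range(2)]

    return mean(full), mean(trimmed)

unbiased, biased = run_study()
for name, u, b in zip(("x1 (biased on)", "x2"), unbiased, biased):
    print(f"{name}: unbiased={u:.3f} biased={b:.3f} change={100 * (b - u) / u:+.1f}%")
```

The printed percentage change for `x1` can be set against the 10% of respondents removed, and the change for `x2` indicates how far bias on one variable carries over to another. Results from this toy setup should not be read as reproducing the thesis's findings.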