Bayesian methods to address multiple comparisons and misclassification bias in studies of occupational and environmental risks of cancer : a thesis by publications presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Public Health, Massey University, Wellington, New Zealand
In this thesis I explore the application of several Bayesian approaches, implemented with standard statistical software, in environmental and occupational epidemiology. These methods are applied to case-control studies of occupational risks for lung and upper aerodigestive tract cancers conducted in New Zealand and Europe. The findings are of interest in themselves, but the focus of the thesis is on the application of Bayesian methods to produce these findings. The thesis is not intended as a comprehensive overview of all Bayesian methods; rather, it explores the Bayesian methods that are most appropriate for the studies presented here.
In the first section, I review the underlying theory involved in such analyses.
In the second section, I use Bayesian methods to address the problem of multiple comparisons. In occupational case-control studies, we may collect information on hundreds of occupations/exposures for which there is little or no prior evidence. For those occupations/exposures that have no true association with the disease, we would expect a false positive finding by chance about 5% of the time when testing at the conventional 5% significance level. This means that if we repeat the study in a new population, these chance associations are likely to exhibit ‘regression to the mean’ and will not show such extreme risks again. Bayesian methods can be used to ‘shrink’ effect estimates based on how strong the regression to the mean is likely to be.
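The shrinkage idea can be illustrated with a minimal sketch. It assumes a simple normal prior on the log odds ratio with a fixed, pre-specified variance (one common way of setting up a semi-Bayes adjustment); the function name, the prior variance of 0.25 and the example numbers are hypothetical choices for illustration and are not taken from the thesis.

```python
import math

def semi_bayes_shrink(log_or, var, prior_mean=0.0, prior_var=0.25):
    """Combine an observed log odds ratio with a normal prior.

    The posterior mean is a precision-weighted average of the observed
    estimate and the prior mean, so imprecise estimates are pulled more
    strongly towards the prior mean. The default prior_var is an
    illustrative assumption, not a recommended value.
    """
    precision_data = 1.0 / var
    precision_prior = 1.0 / prior_var
    post_var = 1.0 / (precision_data + precision_prior)
    post_mean = post_var * (log_or * precision_data + prior_mean * precision_prior)
    return post_mean, post_var

# Hypothetical occupation: observed OR of 2.0 with a standard error of 0.5
# on the log scale (variance 0.25).
observed_or = 2.0
standard_error = 0.5
post_mean, post_var = semi_bayes_shrink(math.log(observed_or), standard_error ** 2)
shrunk_or = math.exp(post_mean)
print(f"observed OR = {observed_or:.2f}, shrunk OR = {shrunk_or:.2f}")
```

Because the data variance and the prior variance are equal in this example, the observed log odds ratio is pulled halfway towards the null; a more precise estimate would be shrunk less.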
In the third section, I use Bayesian methods for assessing and correcting systematic error. Although the methods I use can be applied to several situations (selection bias, misclassification, residual confounding), I apply them to the specific situation of misclassification of the main exposure. In particular, I apply four different methods for such sensitivity analyses: multiple imputation for measurement error (MIME); imputation based on specifying the sensitivity and specificity (SS); direct imputation (DI) of the ‘true’ exposure using a regression model for the predictive values; and imputation based on a fully Bayesian analysis.
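To show how assumed values of sensitivity and specificity feed into a sensitivity analysis, the sketch below uses the simple algebraic back-calculation of expected 'true' exposure counts in a 2x2 table. This is a simpler, deterministic technique than the imputation-based methods listed above and is not the implementation used in the thesis; the function names and the example counts are hypothetical, and misclassification is assumed to be non-differential.

```python
def correct_exposed(observed_exposed, total, sensitivity, specificity):
    """Back-calculate the expected number of truly exposed subjects.

    Uses observed = Se * true + (1 - Sp) * (total - true), solved for true.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    true_exposed = (observed_exposed - (1.0 - specificity) * total) / denom
    if not 0 < true_exposed < total:
        raise ValueError("assumed Se/Sp are incompatible with the observed counts")
    return true_exposed

def corrected_odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp,
                         sensitivity, specificity):
    """Odds ratio after correcting cases and controls with the same Se and Sp."""
    n_cases = case_exp + case_unexp
    n_ctrls = ctrl_exp + ctrl_unexp
    a = correct_exposed(case_exp, n_cases, sensitivity, specificity)
    c = correct_exposed(ctrl_exp, n_ctrls, sensitivity, specificity)
    return (a / (n_cases - a)) / (c / (n_ctrls - c))

# Hypothetical table: 60/40 exposed/unexposed cases, 40/60 controls.
observed = (60 / 40) / (40 / 60)
corrected = corrected_odds_ratio(60, 40, 40, 60, sensitivity=0.9, specificity=0.9)
print(f"observed OR = {observed:.2f}, corrected OR = {corrected:.2f}")
```

In this example the corrected odds ratio is further from the null than the observed one, which is the expected direction for non-differential misclassification of a binary exposure. Repeating the calculation over a range of plausible sensitivity and specificity values gives a basic sensitivity analysis; the imputation and fully Bayesian approaches additionally propagate the uncertainty in those values.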
I conclude by summarising the strengths, limitations, and areas of future development for the use of these methods. It is anticipated that, in 5-10 years' time, such analyses may become standard supplements to ‘traditional’ forms of analysis, i.e. that Bayesian methods may be routinely used and may form part of the ‘epidemiological toolkit’ for assessing and correcting for both random and systematic error.