Data mining is relatively new to the field of statistics, although it is widely used elsewhere. Is it a good
idea to discard model-based methods in favour of data-driven methods? Data-driven methods tend to
produce a high degree of accuracy but offer very little interpretability. Model-based methods are
interpretable but tend to be less accurate. Data mining techniques are commonly used where data
collection has been automated. I will show that these methods are also useful in the large survey setting.
Scheffer, J. (2002). Data mining in the survey setting: why do children go off the rails? Research Letters in the Information and Mathematical Sciences, 3, 161-189.