Real-time GDP nowcasting in New Zealand : an ensemble machine learning approach : a thesis presented for the degree of Master of Philosophy, School of Natural and Computational Sciences, Massey University, New Zealand
Gross Domestic Product (GDP) measures the monetary value of all final goods and services
produced in a region during a given period. In most countries, GDP is released only a
limited number of times a year, and often with a lag. Understanding the current economic
situation, rather than figures from quarters ago, is vitally important for both
policymakers and private entrepreneurs. A live GDP predictor that can nowcast the current
GDP growth rate during the delay before government statistics are released is therefore
crucial.
Econometric approaches have dominated GDP nowcasting for many years. However, most
traditional econometric models can incorporate only a handful of variables in a linear
model structure, which falls short of what the “big data” era requires: better
predictive ability with a large number of unbalanced variables. As computing power has
improved and more high-frequency variables have become available, data-driven approaches
such as Machine Learning methods have been applied to nowcasting. These methods not only
show stronger forecasting ability when handling a large number of predictors but are also
more robust to non-linear data structures. In this research, an Ensemble Method built
from several Machine Learning methods is developed to provide more timely GDP figures
during the delay before government statistics are released.
The input dataset integrates data from multiple sources, such as public statistical
websites, the Reserve Bank of NZ and Stats NZ, and our collaborators, the New Zealand
Transport Agency (NZTA) and PayMark. The study first applies different Machine Learning
methods, such as LightGBM, XGBoost, Support Vector Machine, K-Nearest Neighbors, Ridge
Regression, Lasso and AdaBoost. These algorithms are then combined into an Ensemble Model
using an averaging method that weights each model individually based on its historical
prediction accuracy. The final Ensemble Model is compared with the most commonly used
benchmarks, the ARIMA model and the Random Walk model, in terms of Mean Square Error
(MSE) and Median Absolute Error (MAE). Statistical tests, such as the Friedman test and
the Wilcoxon signed-rank test, are employed to check the significance of model
superiority.
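
The accuracy-based averaging described above can be illustrated with a minimal sketch.
The abstract does not state the weighting formula, so inverse-MSE weighting is assumed
here purely for illustration; the function names and the toy numbers are hypothetical
and are not taken from the thesis.

```python
import numpy as np


def inverse_mse_weights(past_preds, past_actuals):
    """Weight each model by the inverse of its historical MSE (an assumed scheme).

    past_preds:   array of shape (n_periods, n_models), past model predictions
    past_actuals: array of shape (n_periods,), realised GDP growth
    """
    errors = past_preds - past_actuals[:, None]
    mse = np.mean(errors ** 2, axis=0)
    inv = 1.0 / (mse + 1e-12)  # small constant guards against division by zero
    return inv / inv.sum()     # normalise so the weights sum to 1


def ensemble_nowcast(current_preds, weights):
    """Combine the current-period predictions into one weighted-average nowcast."""
    return float(np.dot(weights, current_preds))


# Toy data (hypothetical): four past quarters, three models.
past_actuals = np.array([0.5, 0.7, 0.2, 0.9])
past_preds = np.array([
    [0.6, 0.8, 0.3],
    [0.6, 0.4, 0.9],
    [0.3, 0.5, 0.0],
    [0.8, 0.6, 1.1],
])

weights = inverse_mse_weights(past_preds, past_actuals)
current_preds = np.array([0.40, 0.70, 0.55])
nowcast = ensemble_nowcast(current_preds, weights)
```

In this sketch the first model has the smallest historical error and so receives the
largest weight; any other monotone mapping from past accuracy to weights would fit the
same description.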
The results indicate that the Ensemble Model significantly outperforms the individual
Machine Learning algorithms and the Random Walk model in forecasting accuracy. Compared
with the ARIMA model, it shows slightly better predictive ability and more foresight,
especially in a fluctuating environment.