Jernej Jevšenak, Sašo Džeroski, Saša Zavadlav, Tom Levanič
Tree-Ring Research 74 (2), 210-224, (1 July 2018) https://doi.org/10.3959/1536-1098-74.2.210
KEYWORDS: multiple linear regression, machine learning, Random Forests, bagging, model trees, artificial neural networks, dendroclimatology
Machine learning (ML) is largely unexplored in dendroclimatology, but it is a powerful tool that might improve the accuracy of climate reconstructions. In this paper, different ML algorithms are compared for climate reconstruction from tree-ring proxies. The algorithms considered are multiple linear regression (MLR), artificial neural networks (ANN), model trees (MT), bagging of model trees (BMT), and random forests of regression trees (RF). April-May mean temperature at a Quercus robur stand in Slovenia is predicted with mean vessel area (MVA, correlation coefficient with April-May mean temperature, r = 0.70, p < 0.001) and earlywood width (EW, r = –0.28, p < 0.05). Similarly, June-August mean temperature is predicted with stable carbon isotope (δ13C, r = 0.72, p < 0.001), stable oxygen isotope (δ18O, r = 0.32, p < 0.05) and tree-ring width (TRW, r = 0.11, p > 0.05 (ns)) chronologies. The predictive performance of the ML algorithms was estimated by 3-fold cross-validation repeated 100 times. In the spring and summer temperature models, BMT performed best in 62% and 52% of the 100 repetitions, respectively. The second-best method was ANN. Although BMT gave the best validation results, the differences in the models’ performances were minor. We therefore recommend always comparing different ML regression techniques and selecting the optimal one for applications in dendroclimatology.
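The evaluation scheme described above, k-fold cross-validation repeated many times with a count of how often each method scores best, can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the two models (a training-mean baseline and a single-predictor least-squares fit) are simple stand-ins for the algorithms compared in the paper, and the function names, the RMSE criterion, and the seeds are assumptions made for illustration only.

```python
import random
import statistics


def fit_mean(xs, ys):
    """Stand-in baseline: always predict the training mean."""
    mu = statistics.fmean(ys)
    return lambda x: mu


def fit_linear(xs, ys):
    """Stand-in model: ordinary least squares with one predictor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def rmse(model, xs, ys):
    """Root mean squared error of a fitted model on held-out data."""
    return (sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)) ** 0.5


def repeated_kfold(xs, ys, fitters, k=3, repeats=100, seed=0):
    """Count, per repetition, which fitter has the lowest mean fold RMSE."""
    rng = random.Random(seed)
    wins = {name: 0 for name in fitters}
    for _ in range(repeats):
        idx = list(range(len(xs)))
        rng.shuffle(idx)                       # new random partition each repetition
        folds = [idx[i::k] for i in range(k)]  # k disjoint folds
        scores = {}
        for name, fit in fitters.items():
            fold_rmse = []
            for fold in folds:
                held_out = set(fold)
                train = [i for i in idx if i not in held_out]
                model = fit([xs[i] for i in train], [ys[i] for i in train])
                fold_rmse.append(
                    rmse(model, [xs[i] for i in fold], [ys[i] for i in fold])
                )
            scores[name] = statistics.fmean(fold_rmse)
        wins[min(scores, key=scores.get)] += 1
    return wins


# Synthetic proxy and temperature series (illustrative values only).
data_rng = random.Random(42)
proxy = [data_rng.gauss(0, 1) for _ in range(60)]
temperature = [12.0 + 0.8 * p + data_rng.gauss(0, 0.5) for p in proxy]

wins = repeated_kfold(proxy, temperature, {"mean": fit_mean, "linear": fit_linear})
print(wins)
```

Each repetition reshuffles the observations, so the win counts summarise how stable a method's advantage is across partitions rather than reporting a single split.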