https://doi.org/10.1007/s10661-019-7428-x

Applying machine learning to forecast daily Ambrosia pollen using environmental and NEXRAD parameters

Gebreab K. Zewdie · Xun Liu · Daji Wu · David J. Lary · Estelle Levetin

Received: 4 January 2017 / Accepted: 20 March 2019
© Springer Nature Switzerland AG 2019

This article is part of the Topical Collection on Geospatial Technology in Environmental Health Applications

G. K. Zewdie (✉) · X. Liu · D. Wu · D. J. Lary
William B. Hanson Center for Space Sciences, The University of Texas at Dallas, Richardson, TX, USA
e-mail: gebreab.zewdie@utdallas.edu
X. Liu e-mail: xun.liu@utdallas.edu
D. Wu e-mail: daji.wu@utdallas.edu
D. J. Lary e-mail: david.lary@utdallas.edu

E. Levetin
The University of Tulsa, Tulsa, OK 74104, USA
e-mail: estelle-levetin@utulsa.edu

Abstract
Approximately 50 million Americans have allergic diseases. Airborne plant pollen is a significant trigger for several of these allergic diseases. Ambrosia (ragweed) is known for its abundant production of pollen and its potent allergic effect in North America. Hence, estimating and predicting the daily atmospheric concentration of pollen (ragweed pollen in particular) is useful both for people with allergies and for the health professionals who care for them. In this study, we show that a suite of variables including meteorological and land surface parameters, as well as next-generation radar (NEXRAD) measurements, together with machine learning, can be used to successfully estimate the daily pollen concentration. The supervised machine learning approaches we used included random forests, neural networks, and support vector machines. The performance of the trained models was independently validated on 10% of the original dataset, set aside using the holdout cross-validation method.
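As a rough illustration of the holdout validation described above, the following Python sketch sets aside 10% of a dataset, fits a model on the remaining 90%, and scores the held-out predictions with the correlation coefficient (R) and its square. The synthetic data, the function names `holdout_split` and `pearson_r`, the fixed seeds, and the least-squares line used as a stand-in model are all assumptions made for this example; they are not the data, code, or learners used in the study.

```python
import math
import random


def holdout_split(rows, test_fraction=0.10, seed=0):
    """Shuffle the rows and set aside test_fraction of them for validation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Synthetic stand-in data (hypothetical): one predictor and a noisy target.
rng = random.Random(42)
rows = []
for _ in range(200):
    x = rng.uniform(0.0, 10.0)
    rows.append((x, 2.0 * x + rng.gauss(0.0, 5.0)))

train, test = holdout_split(rows, test_fraction=0.10)

# Stand-in model (assumption): a least-squares line fitted on training rows only.
xs = [x for x, _ in train]
ys = [y for _, y in train]
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

# Score the model on the held-out 10% only.
observed = [y for _, y in test]
predicted = [intercept + slope * x for x, _ in test]
r = pearson_r(observed, predicted)
r_squared = r ** 2  # assumption: R-squared taken as the square of R

print(f"held-out rows: {len(test)}, R = {r:.2f}, R^2 = {r_squared:.2f}")
```

Scoring only the held-out rows keeps the resulting R independent of the data used for fitting, which is the purpose of the holdout partition.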
The random forests (R = 0.61, R² = 0.37), support vector machines (R = 0.51, R² = 0.26), and neural networks (R = 0.46, R² = 0.21) effectively predicted the daily Ambrosia pollen concentration, where the correlation coefficient (R) and R-squared (R²) values are given in parentheses. Three independent approaches (random forests, correlation coefficients, and interaction information) were employed to rank the relative importance of the available predictors.

Keywords Pollen · Machine learning · Environmental parameters · NEXRAD measurements

Environ Monit Assess (2019) 191(Suppl 2): 261

Introduction
Pollen is known to be a trigger for allergic diseases, e.g., asthma, hay fever, and allergic rhinitis (Oswalt and Marshall 2008; Howard and Levetin 2014). It is interesting that a variety of non-respiratory issues such as strokes (Low et al. 2006; Matheson et al. 2008), and surprisingly, even suicide and attempted suicide (Postolache et al. 2005; Stickley et al. 2017)