Data-intensive modeling of forest dynamics

Jean F. Liénard a, Dominique Gravel b, Nikolay S. Strigul a,*

a Department of Mathematics, Washington State University, Vancouver, Washington, USA
b Département de Biologie, Université du Québec à Rimouski, Québec, Canada

Article info

Article history:
Received 12 January 2015
Accepted 19 January 2015
Available online

Keywords: Data-intensive model; Forest dynamics; Gibbs sampling; Markov chain model; Markov chain Monte Carlo; Patch-mosaic concept; Plant population and community dynamics

Abstract

Forest dynamics are highly dimensional phenomena that are not fully understood theoretically. Forest inventory datasets offer unprecedented opportunities to model these dynamics, but they are analytically challenging due to high dimensionality and sampling irregularities across years. We develop a data-intensive methodology for predicting forest stand dynamics using such datasets. Our methodology involves the following steps: 1) computing stand level characteristics from individual tree measurements, 2) reducing the characteristic dimensionality through analyses of their correlations, 3) parameterizing transition matrices for each uncorrelated dimension using Gibbs sampling, and 4) deriving predictions of forest developments at different timescales. Applying our methodology to a forest inventory database from Quebec, Canada, we discovered that four uncorrelated dimensions were required to describe the stand structure: the biomass, biodiversity, shade tolerance index and stand age. We were able to successfully estimate transition matrices for each of these dimensions. The model predicted substantial short-term increases in biomass and longer-term increases in the average age of trees, biodiversity, and shade-intolerant species.
Using highly dimensional and irregularly sampled forest inventory data, our original data-intensive methodology provides both descriptions of the short-term dynamics and predictions of forest development on a longer timescale. This method can be applied in other contexts such as conservation and silviculture, and can be delivered as an efficient tool for sustainable forest management.

© 2015 Elsevier Ltd. All rights reserved.

Software and data availability

The software to estimate transition matrices based on forest inventory data was implemented by Jean Liénard in R version 2.15.1 (R Core Team, 2012) and is attached as a zip file to the submission. The database studied in this paper is available upon request from the Quebec provincial forest inventory database (http://www.mffp.gouv.qc.ca/forets/inventaire/). With straightforward modifications, the software can also be used with data from the USDA Forest Inventory and Analysis program (http://www.fia.fs.fed.us/).

1. Introduction

Forest ecosystems are complex adaptive systems with hierarchical structures resulting from self-organization in multiple dimensions simultaneously (Levin, 1999). The patch-mosaic concept was actively developed in the second half of the twentieth century after Watt (1947) suggested that ecological systems can be considered a collection of patches at different successional stages. Dynamical equilibria arise at the level of the mosaic of patches rather than at the level of a single patch. The classic patch-mosaic methodology assumes that patch dynamics can be represented by changes in macroscopic variables characterizing the state of the patch as a function of time (Levin and Paine, 1974). Forest disturbances are traditionally associated with a loss of biomass; however, Markov chain models based only on biomass do not capture forest succession comprehensively (Strigul et al., 2012).
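The biomass-only Markov chain view of a patch mosaic mentioned above can be illustrated with a minimal Python sketch. It is not the authors' R software: the three biomass classes, the transition probabilities in `P`, and the helper names `project` and `stationary` are all hypothetical and chosen only to show how a transition matrix yields short-term and long-term predictions for a collection of patches.

```python
import numpy as np

# Hypothetical 3-state chain over biomass classes (low, medium, high).
# Row i holds the probabilities of moving from class i to each class
# during one time step. The values are illustrative, not estimates.
P = np.array([
    [0.70, 0.30, 0.00],   # low    -> low, medium, high
    [0.10, 0.60, 0.30],   # medium -> low (disturbance), medium, high
    [0.10, 0.00, 0.90],   # high   -> low (disturbance), medium, high
])

def project(distribution, transition, steps):
    """Return the distribution of patches over states after `steps` steps."""
    return distribution @ np.linalg.matrix_power(transition, steps)

def stationary(transition):
    """Return the stationary distribution: the left eigenvector of the
    transition matrix for the eigenvalue closest to 1, scaled to sum to 1."""
    eigenvalues, eigenvectors = np.linalg.eig(transition.T)
    index = np.argmin(np.abs(eigenvalues - 1.0))
    vector = np.real(eigenvectors[:, index])
    return vector / vector.sum()

# Assumed starting point: every patch is in the low-biomass class.
start = np.array([1.0, 0.0, 0.0])

short_term = project(start, P, 1)     # one step ahead
long_term = project(start, P, 200)    # many steps ahead
equilibrium = stationary(P)           # equilibrium of the whole mosaic

print("after 1 step:   ", short_term)
print("after 200 steps:", long_term)
print("stationary:     ", equilibrium)
```

Individual patches keep changing state in this sketch, yet the proportions of patches in each class settle to a fixed distribution, which is the sense in which equilibrium arises at the level of the mosaic. The sketch tracks a single dimension, biomass, and nothing else.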
This limitation motivates the need for alternative formulations that are able to consider several forest dimensions instead of only one. Here we develop a novel statistical methodology for estimating transition probability matrices from forest inventory data and generalize the classic patch-mosaic framework to multiple uncorrelated dimensions. In particular, we develop a landscape-scale patch-mosaic model of forest stand dynamics using a Markov chain framework, and validate the model using the Quebec provincial forest inventory data. The novelty of our modeling framework lies in the consideration of forest transitions within multiple

* Corresponding author. E-mail address: nick.strigul@vancouver.wsu.edu (N.S. Strigul).
Environmental Modelling & Software 67 (2015) 138–148
http://dx.doi.org/10.1016/j.envsoft.2015.01.010