PHYSICAL REVIEW E VOLUME 51, NUMBER 5 MAY 1995

Symbol sequence statistics in noisy chaotic signal reconstruction

X. Z. Tang* and E. R. Tracy
Physics Department, College of William and Mary, Williamsburg, Virginia 23185

A. D. Boozer
Physics Department, University of Virginia, Charlottesville, Virginia 22901

A. deBrauw
Physics Department, Carleton College, Northfield, Minnesota 55057-4025

R. Brown
Institute for Nonlinear Science, University of California, San Diego, California 92093-0402

(Received 3 October 1994)

A method is discussed for reconstructing chaotic systems from noisy signals using a symbolic approach. The state space of the dynamical system is partitioned into subregions and a symbol is assigned to each subregion. Consequently, an orbit in a continuous state space is converted into a long symbol string. The probabilities of occurrence for different symbol sequences constitute the symbol sequence statistics. The symbol sequence statistics are easily measured from the signal output and are used as the target for reconstruction (i.e., for assessing the goodness of fit of proposed models). Reliable reconstructions were achieved given a noisy chaotic signal, provided the general class of the model of the underlying dynamics is known. Both observational and dynamical noise were considered, and they were not limited to small amplitudes. Substantial noise produces a strong bias in the symbol sequence statistics, but such bias can be tracked and effectively eliminated by including the noise characteristics in the model. This is demonstrated by the robust reconstruction of the Hénon and Ikeda maps even when the signal-to-noise ratio is ~1. Applications of this method include extracting control parameters for nonlinear dynamical systems and nonlinear model evaluation from experimental data.

PACS number(s): 05.45.+b, 05.40.+j, 02.50.Ph

*Present address: Dept. of Applied Physics, Columbia University, 500 West 120th Street, New York, NY 10027.
†Present address: Peace Corps, P.O. Box 208, Lilongwe, Malawi.

I. INTRODUCTION

In this paper we consider the inverse problem of reconstructing a model of a dynamical system from measured time-series data. Such a model might be used for prediction, or control, or many other potential applications. Constructing models from time-series data has a long history (see, e.g., Ref. [1]). However, substantial gaps in modeling capability remain.

Here we treat the following situation: the system we wish to model is low dimensional and deterministic. However, no physical system can ever be fully isolated from its environment. Therefore, real systems will always be subject to driving by various forms of dynamical noise. In addition, the observational data we are given may be polluted by substantial amounts of measurement noise.

Decomposing a given signal as "chaos plus noise" is not a well-defined procedure without further constraints on the decomposition process. In most physics applications, these constraints are provided by the fact that one often has some knowledge of the class of models to be considered, both for the deterministic part and the noise (e.g., the noise is white, 1/f, shot noise, etc.). For the decomposition of the signal into chaos plus noise to be useful, the deterministic part of the model should have relatively few degrees of freedom and be as simple as possible. This can be made more precise by using, for example, a minimum description length criterion [2] or some other objective measure to avoid overfitting. In addition, the noise part of the model should be as small in amplitude as possible.

The long-term goal of the present work is to develop techniques for the analysis and modeling of chaotic systems from short and/or noisy data sets. Our work is model based and not black box based.
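As an illustration of the symbolization step summarized in the abstract, the following minimal Python sketch converts a noisy scalar signal into a symbol string and estimates the probabilities of all observed symbol sequences of a given length. It is a sketch under stated assumptions, not the procedure used in this paper: the Hénon parameters a = 1.4 and b = 0.3, the two-symbol partition at x = 0, the word length, the Gaussian observational noise level, and all function names are hypothetical choices made here for illustration.

```python
import random
from collections import Counter


def henon_orbit(n, a=1.4, b=0.3, x0=0.0, y0=0.0, discard=1000):
    """Return n successive x-values of the Henon map after a transient.

    The parameter values and initial condition are assumed, commonly used
    defaults; they are not taken from the paper.
    """
    x, y = x0, y0
    xs = []
    for i in range(n + discard):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= discard:
            xs.append(x)
    return xs


def add_observational_noise(signal, sigma, seed=0):
    """Add independent Gaussian noise of standard deviation sigma."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def symbolize(signal, threshold=0.0):
    """Assumed binary partition: '1' if the value exceeds threshold, else '0'."""
    return "".join("1" if v > threshold else "0" for v in signal)


def sequence_statistics(symbols, length):
    """Relative frequency of every observed word of the given length."""
    n_words = len(symbols) - length + 1
    if n_words <= 0:
        raise ValueError("symbol string is shorter than the word length")
    counts = Counter(symbols[i:i + length] for i in range(n_words))
    return {word: count / n_words for word, count in counts.items()}


# Compare the statistics of a clean and a noisy observation of the same orbit.
orbit = henon_orbit(20000)
clean_stats = sequence_statistics(symbolize(orbit), length=4)
noisy_stats = sequence_statistics(
    symbolize(add_observational_noise(orbit, sigma=0.5)), length=4
)
for word in sorted(set(clean_stats) | set(noisy_stats)):
    print(word, clean_stats.get(word, 0.0), noisy_stats.get(word, 0.0))
```

Words that have zero probability in the clean signal typically acquire nonzero probability once noise is added, which is one simple way to see the noise-induced bias in the symbol sequence statistics mentioned in the abstract.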
One could be led to the choice of models either by application of some classification scheme (e.g., Refs. [3] or [4]) applied to a "mystery" signal, or by appropriate experimental design, or by taking it as a working hypothesis to be tested. We also assume that the desired model is continuous in the state space variables. In principle, the techniques described here can be used in ab initio reconstruction, but we do not anticipate that this will be their main area of application.

In a series of papers (see, e.g., Ref. [4] and references therein) Crutchfield and co-workers have emphasized the utility of a symbolic approach to the characterization and modeling of nonlinear systems. They show, for example, how to construct an optimal finite-state model using only

1063-651X/95/51(5)/3871(19)/$06.00 3871 ©1995 The American Physical Society