Physica A 264 (1999) 570–580

Random aggregation models for the formation and evolution of coding and non-coding DNA

A. Provata*

Institute of Physical Chemistry, NRCPS Demokritos, 15310, Athens, Greece

Received 28 July 1998

Abstract

A random aggregation model with influx is proposed for the formation of the non-coding DNA regions via random co-aggregation and influx of biological macromolecules such as viruses, parasite DNA, and replication segments. The constant mixing (transpositions) and influx drive the system into an out-of-equilibrium steady state characterised by a power law size distribution. The model predicts the long range distributions found in the non-coding eucaryotic DNA and explains the observed correlations. For the formation of coding DNA a random closed aggregation model is proposed which predicts short range coding size distributions. The closed aggregation process drives the system into an almost "frozen" stable state which is robust to external perturbations and which is characterised by well defined space and time scales, as observed in coding sequences. © 1999 Elsevier Science B.V. All rights reserved.

PACS: 02.50; 05.40; 87.10

Keywords: Power law; Long range correlations; Coding/non-coding DNA sequences; Out-of-equilibrium steady state; Aggregation

1. Introduction

Recent studies on the structure of DNA macromolecules have revealed the existence of unexpected long range correlations in the non-coding part of higher eucaryotic DNA, while the coding part of most organisms seems to be statistically uncorrelated [1–7]. The long range correlations were first demonstrated via the "DNA walk" model proposed by Peng et al. in 1992 [1]. Later, the same conclusions were reached by examining the size distribution of Purine and Pyrimidine clusters in pure coding and non-coding regions of different organisms [2,3].

* E-mail: aprovata@limnos.nrcps.ariadne-t.gr.

0378-4371/99/$ – see front matter © 1999 Elsevier Science B.V. All rights reserved.
PII: S0378-4371(98)00546-9
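
As an illustration of the two diagnostics mentioned in the Introduction, the following is a minimal sketch and not the code used in Refs. [1–3]. It assumes the common convention in which a pyrimidine (C, T) contributes a step of +1 and a purine (A, G) a step of -1 to the "DNA walk". The function names `dna_walk` and `cluster_sizes` are hypothetical and chosen only for this example.

```python
from itertools import accumulate, groupby

# Assumed two-letter reduction of the four-base alphabet:
# purines (A, G) -> "R", pyrimidines (C, T) -> "Y".
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify(base):
    """Map a single base to 'R' (purine) or 'Y' (pyrimidine)."""
    b = base.upper()
    if b in PURINES:
        return "R"
    if b in PYRIMIDINES:
        return "Y"
    raise ValueError(f"unexpected base: {base!r}")


def dna_walk(seq):
    """Cumulative displacement of the walk along the sequence.

    Assumed step convention: +1 for a pyrimidine, -1 for a purine.
    """
    steps = (1 if classify(b) == "Y" else -1 for b in seq)
    return list(accumulate(steps))


def cluster_sizes(seq):
    """Lengths of consecutive runs of purines or pyrimidines."""
    classes = (classify(b) for b in seq)
    return [(kind, sum(1 for _ in run)) for kind, run in groupby(classes)]


if __name__ == "__main__":
    example = "AAGCT"  # toy sequence, not real data
    print(dna_walk(example))
    print(cluster_sizes(example))
```

A histogram of the run lengths returned by `cluster_sizes` over a long sequence gives the kind of cluster size distribution referred to above; whether it decays as a power law or over a short range has to be assessed on real sequence data.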