PRZEGLĄD ELEKTROTECHNICZNY, ISSN 0033-2097, R. 90 NR 1/2014

Paweł DYMORA, Mirosław MAZUREK, Dominik STRZAŁKA
Politechnika Rzeszowska, Zakład Systemów Rozproszonych

Long-range dependencies in quick-sort algorithm

Streszczenie. Sorting is one of the most frequently used types of processing in computer systems. In the presented approach, sorting is considered as the introduction of order into the processed input task, and the algorithm as a physical system (responsible for the computations). Usually, the behavior of an algorithm is analyzed in terms of classical computational complexity. In this paper, the existence of long-range dependencies in the processing dynamics is determined on the basis of the Hurst coefficient. (Long-range dependencies in the quick-sort algorithm).

Abstract. Sorting is one of the most frequently used types of processing in computer systems. In the presented approach, sorting is considered as the introduction of order into the processed input task, and the algorithm as a physical system (responsible for the computations). The analysis shows how dependencies in processed tasks can influence the behavior of an algorithm (or, equivalently, a Turing machine). Usually, the behavior of an algorithm is analyzed in terms of classical computational complexity. In this paper, the degree of long-term correlations in the processing dynamics is calculated based on the Hurst coefficient.

Słowa kluczowe: long-range dependencies, quick-sort algorithm, Hurst coefficient, self-similarity.
Keywords: long-range dependence, quick-sort algorithm, Hurst coefficient, self-similarity.

doi:10.12915/pe.2014.01.35

Introduction

It is widely believed that computing is a mathematical science. This view is somewhat justified, since the foundation of most considerations in computer science is the idea of a Turing machine [1].
This is a purely mathematical concept, presented as one of the possible answers to the problem posed by David Hilbert (other answers were given by K. Gödel, A. Church and S. Kleene). Over the years, this model has also been regarded as a point of reference for many theoretical aspects of computer science [2]. The machine has nothing to do with a physical device, and describing some of its features and properties is quite difficult in terms of physics, especially the infinite length of the tape (i.e., memory) and the zero energy consumption during processing. It is the model used for algorithmic processing, and whenever someone says 'algorithm', they always have the model of the Turing machine in mind. On the other hand, it should also be noted that any implementation of this machine (simply: a computer) is a physical device subject to many restrictions: the tape (memory) always has a finite length, and during operation the machine consumes energy and produces entropy [3].

Many practical applications of the theory developed in computer science are carried out by computer engineering, relating to various types of systems: computers, computer networks, software (including operating systems), databases, graphical interfaces, etc. Most of their description is given in terms of simple systems, but modern computer systems have evolved towards increasing software and hardware complexity, and their description needs a new perspective: complex systems. From the physical point of view, computer processing is done in computer systems (it takes place in a machine) and is a transformation of energy into useful work (implemented calculations, performed algorithms) and entropy (the part of the energy that is wasted: for example, the heat removed by the fan, but not only; this will be explained a bit later).
This problem was pointed out by Charles Bennett, who stated that [3]: computers may be thought of as engines for transforming free energy into waste heat and mathematical work, but somehow his approach was not pursued. In this paper this observation is the main point of reference, and as an example of the considerations that can be carried out, the problem of sorting will be presented. In theoretical computer science it is assumed that it does not matter how a Turing machine is implemented (it may even be a steam engine), but it seems that some important and necessary physical aspects should be introduced into the mathematical considerations. Justification for this statement can be found among the many voices saying that computer science is not necessarily a mathematical science but rather a physical one [2]. The main problem is the fact that statistical methods and analyses are used very sparingly in theoretical considerations; it is generally assumed that mathematical considerations are sufficient. However, some areas of computer science can be shown where this is not enough. For example: static and dynamic hazards in combinational circuits (digital devices) [4], the problem of state encoding in asynchronous sequential circuits (critical and non-critical races) [4], the statistical self-similarity that limits the bandwidth of queues [5], the limited scalability of distributed systems [6], the communication costs that influence the efficiency of distributed and parallel systems [7], and the rapid decline in the performance of systems with limited resources [8]. Most of these problems have a physical background and cannot be understood in terms of a purely mathematical (theoretical) approach.

Problem of sorting

In this paper we present a different perspective on the analysis of sorting. It is one of the most important forms of processing in computer systems.
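As a running illustration (a minimal sketch, not code from the paper), the snippet below shows a naive quick-sort with a first-element pivot, instrumented to count key comparisons, the quantity usually treated as the dominant operation in complexity analysis. An already sorted input drives the count to the worst case of n(n-1)/2 comparisons, while a random input stays near n log n; the function and variable names are illustrative assumptions.

```python
import random

def quicksort(a, counter):
    """Naive quick-sort (first element as pivot). counter[0] tallies
    key comparisons, treated here as the dominant operation."""
    if len(a) <= 1:
        return a
    pivot, rest = a[0], a[1:]
    counter[0] += len(rest)  # one comparison of each remaining key with the pivot
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return quicksort(left, counter) + [pivot] + quicksort(right, counter)

if __name__ == "__main__":
    n = 500
    for label, data in [("random input", random.sample(range(n), n)),
                        ("sorted input (worst case)", list(range(n)))]:
        c = [0]
        result = quicksort(data, c)
        assert result == sorted(data)
        # for the sorted input the count equals n*(n-1)//2 exactly
        print(f"{label}: {c[0]} comparisons")
```

The first-element pivot is chosen deliberately: it makes the dependence of the operation count on the order already present in the input visible, which is exactly the kind of input-dependent dynamics the paper examines.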
According to Donald Knuth in his book The Art of Computer Programming, up to 70% of the operations performed by computer systems are related to sorting. The problem is algorithmically closed, i.e., the known algorithms are as efficient as theoretical considerations allow (the class O(n log n)), and it is one of the first issues raised in courses on algorithms and programming, where it is used to explain the question: "What is computational complexity?". This concept was introduced around 1967 by Hartmanis and Stearns [9], who wanted to show how difficult the solutions of some problems can be and that there may be different, more or less efficient, ways to solve computational tasks. Their proposed approach made two important assumptions about how algorithm performance is measured [10]:
1) The analysis counts the number of dominant operations required, which may include comparisons, arithmetic operations, loop calls, etc. This diversity is justified by the fact that in most types of computers (but not all), each such operation is performed in the same time.
2) The measure is calculated independently of the properties of the input set (instances), and the so-called worst case (the largest number of required dominant operations) is taken (in