Background info for Map and Reduce

Maarten Fokkinga

Version of December 14, 2011, 16:19

1 Intro. What I will discuss ranges from Functional Programming up to the Theory of Datatypes, and is meant as background information for the MapReduce paradigm:

[Figure: diagram relating the fields Math, ...; Category Theory, ...; Theory of Datatypes; Algorithmics / Math of Program Construction; Functional Programming; Imperative Programming; Haskell; MapReduce Paradigm, ...; Hadoop. Legend: one direction means "more abstract / foundation of / environment for", the other "more concrete / used in / realized in". One link is labeled "inspires".]

2 Notation. Function composition is written as f · g, and application of f to a is written f a or f . a. Application written as a space has the highest priority, and application written as a "low dot" has the lowest priority (as suggested by the wide space surrounding the dot). So f · g . x + y = f (g (x + y)). This convention saves parentheses and thus improves readability.

To facilitate reasoning by algebraic manipulation (that is, repeatedly replacing a part of an expression by a different but semantically equal part), we generally prefer to work on the function level (expressing a function as a combination of other functions) rather than on the point level (where the outcome of a function application is expressed as a combination of the outcomes of other function applications). Thus we prefer to say, for example,

    f = f₁ · (+) · (f₂ Δ f₃) · f₄

over

    f (x) = f₁ (f₂ (f₄ (x)) + f₃ (f₄ (x))),

so that the replacement of (f₂ Δ f₃) by (f₃ Δ f₂) or by f′ is easier to perform.

The list of items a, b, c, ..., in that order, is denoted [a, b, c, ...]. Operation ++ is list concatenation (also called join), so that [a, b, c] ++ [d, e] = [a, b, c, d, e]. Function tip is the singleton list former: tip x = [x].
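
The notation above can be tried out in Haskell. The following is a minimal sketch under some assumptions, not a definitive rendering. Haskell's `.` stands in for composition (·) and `$` for the low-dot application. Δ is assumed here to be the pairing combinator, (f Δ g) x = (f x, g x), which is what the point-level equation suggests; `(&&&)` from `Control.Arrow` is used for it, and `uncurry (+)` for the (+) that acts on a pair. The bodies chosen for `f1` to `f4` are arbitrary placeholders, and `lowDot` is a hypothetical helper name.

```haskell
import Control.Arrow ((&&&))

-- Arbitrary placeholder functions; the text leaves f1..f4 unspecified.
f1, f2, f3, f4 :: Int -> Int
f1 = (* 2)
f2 = (+ 1)
f3 = (* 3)
f4 = subtract 5

-- Point level: the result is built from the results of applications.
pointLevel :: Int -> Int
pointLevel x = f1 (f2 (f4 x) + f3 (f4 x))

-- Function level: the same function as a composition of functions.
-- Assumption: (&&&) plays the role of the text's pairing operator,
-- and uncurry (+) adds the two components of the resulting pair.
functionLevel :: Int -> Int
functionLevel = f1 . uncurry (+) . (f2 &&& f3) . f4

-- Low-priority application: f1 . f2 $ x + y means f1 (f2 (x + y)).
lowDot :: Int -> Int -> Int
lowDot x y = f1 . f2 $ x + y

-- Singleton list former, as defined in the text.
tip :: a -> [a]
tip x = [x]
```

Loaded into GHCi, `pointLevel 7` and `functionLevel 7` both evaluate to 18, `lowDot 3 4` evaluates to 16, and `tip 'a' ++ "bc"` evaluates to "abc". Since addition is commutative, swapping the pair to `(f3 &&& f2)` in `functionLevel` is a local edit that leaves the result unchanged, which illustrates why the function-level form is easier to manipulate.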