APPLICATIVE QUERY LANGUAGES

Peter T. Breuer
Cambridge University Engineering Department

Abstract

We give a formal calculus, based on higher-order functions and indexed data, for the evaluation of database queries. Simple functions take the role sometimes played by tabular relations in evaluating queries, whilst a set of higher-order generators replaces the query-interpretation mechanism. The calculus may be implemented in any sufficiently powerful programming language and yields a computationally complete database query language. This paper gives an explicit presentation of the ‘transform calculus’.

0. Introduction

Functional data models and functional query languages have appeared several times in the literature [Bune79] [Ship81] [Bune82] [Boss84] [Poul88], but formal foundational theories for either models or queries have been few and far between. Perhaps the best example of a formal theory for a data model is given in [Ullm82] for relational query languages, building on earlier expositions, notably [Codd71] [Codd72]. The depth of the treatment given in all these works has contributed greatly to the confidence with which we use languages based on the relational calculus.

Those presentations rely on the formal specification of an underlying algebraic calculus, which may then be amenable to a deep mathematical analysis. The subject of this paper is essentially the presentation of a data model and a query calculus which are particularly suited for implementation in functional programming languages. Each of the definitions given is short and simple: an appropriate program script in a modern applicative (that is, ‘functional’) language with agreeable syntax (such as Miranda¹ [Turn86]) totals no more than about twenty lines. The operators themselves have a correspondingly short low-level code representation in these languages, which in turn means that they are fast to execute.
These operators do not build tables as intermediates in the computation. We give a set of generators² which accept queries as arguments and return a query as a result. These are the elements of a ‘transform calculus’, ultimately based on an explicitly indexed representation of data. The ‘transform calculus’ itself is a set of algorithms which make new algorithms out of old algorithms, yielding as a result an algorithm which may eventually be applied to data as a database query. We may visualize more of the computational effort than usual as going into the algorithm-building part of the process, in keeping with the functional languages which must host the implementations.

In contrast, the standard functional database model (cf. FQL [Bune82]) is based on unindexed data held in unordered lists. The concept is very like that of tuple data held in an unordered table [Ullm82], except that functions which can derive one table from another fill the role played by the tables themselves in the relational models. This allows attributes to be created as needed, as ‘virtual’ temporary constructs, and avoids the overheads associated with their permanent storage.

Because the operational semantics of modern pure functional languages are usually lazy, the potential exists for the automatic self-optimization of executable queries: only the parts of a calculation which are ultimately necessary to the result will be executed, no matter how the query is phrased. This is one of the advantages to be gained from the transfer of database programming to the functional domain, but there are some difficulties associated with the acquisition of data for the computation and the duration for which it should persist in the cache. Nowadays [Atki87] these tend to be resolved by incorporating the entire database store in the area of memory directly available to the functional programming processor.
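The idea of generators that accept a query and return a query, evaluated lazily over indexed data, can be illustrated with a minimal sketch. It is written in Python rather than in Miranda, and it is not the calculus defined in this paper: the names `scan`, `restrict` and `project`, the dictionary-based store, and the sample records are all hypothetical, and Python generator expressions stand in only loosely for the lazy semantics of a pure functional language.

```python
from typing import Any, Callable, Dict, Iterator, Tuple

# Assumption: the store maps an explicit index to a record, and a "query"
# is a function from the store to a lazy stream of (index, value) pairs.
Store = Dict[int, Dict[str, Any]]
Query = Callable[[Store], Iterator[Tuple[int, Any]]]
Generator = Callable[[Query], Query]


def scan(store: Store) -> Iterator[Tuple[int, Any]]:
    """Base query: stream every indexed record in the store."""
    return iter(store.items())


def restrict(pred: Callable[[Any], bool]) -> Generator:
    """Hypothetical generator: turn a query into one that keeps matching values."""
    def generator(query: Query) -> Query:
        def new_query(store: Store) -> Iterator[Tuple[int, Any]]:
            # A generator expression: no intermediate table is materialized.
            return ((i, v) for i, v in query(store) if pred(v))
        return new_query
    return generator


def project(field: str) -> Generator:
    """Hypothetical generator: turn a query into one that yields a single field."""
    def generator(query: Query) -> Query:
        def new_query(store: Store) -> Iterator[Tuple[int, Any]]:
            return ((i, v[field]) for i, v in query(store))
        return new_query
    return generator


# Illustrative data only.
store: Store = {
    1: {"name": "Ada", "dept": "eng"},
    2: {"name": "Bob", "dept": "ops"},
    3: {"name": "Cy", "dept": "eng"},
}

examined = []  # records which names the predicate has actually inspected


def in_eng(record: Dict[str, Any]) -> bool:
    examined.append(record["name"])
    return record["dept"] == "eng"


# Build a new query out of old ones; nothing is evaluated at this point.
eng_names: Query = project("name")(restrict(in_eng)(scan))

# Demanding only the first answer inspects only as much data as is needed.
first = next(eng_names(store))
examined_for_first = list(examined)

# Demanding every answer runs the query to completion.
result = list(eng_names(store))
```

In this sketch the work of composing `project`, `restrict` and `scan` happens before any data is touched, and asking for the first answer alone inspects a single record, which is the kind of demand-driven behaviour the text attributes to lazy evaluation.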
¹ Miranda is a trademark of Research Software Ltd.
² We use the term ‘generator’ rather than ‘combining form’ [Back78].

In many ways, however, the design of functional programming database languages has not changed at a deep level during the evolutionary process which has reconciled database