Formalized Metatheory with Terms Represented by an Indexed Family of Types

Robin Adams
Royal Holloway, University of London
robin@cs.rhul.ac.uk

Abstract. It is possible to represent the terms of a syntax with binding constructors by a family of types, indexed by the free variables that may occur. This approach has been used several times for the study of syntax and substitution, but never for the formalization of the metatheory of a typing system. We describe a recent formalization of the metatheory of Pure Type Systems in Coq as an example of such a formalization. In general, careful thought is required as to how each definition and theorem should be stated, usually in an unfamiliar ‘big-step’ form; but, once the correct form has been found, the proofs are very elegant and direct.

1 Introduction

In [1], Bellegarde and Hook show how the terms of a language with binding constructors can be represented as a nested datatype: a type constructor that takes types (including possibly its own values) as arguments. This idea has since been used several times for the study of the syntax of such languages, for example in Altenkirch and Reus [2], and Bird and Paterson [3]. However, to the best of the author’s knowledge, it has never been used in a formalization of the metatheory of a formal system.

We present here a formalization in Coq of the metatheory of Pure Type Systems (PTSs) using this representation for the set of terms. We prove all of the results about arbitrary PTSs given in Barendregt [4], including Subject Reduction and Uniqueness of Types for functional PTSs. The formalization also includes van Benthem Jutting’s proof of Strengthening [5].

There have been several formalizations of the metatheory of formal systems in the past, two of the largest being McKinna and Pollack [6] and Barras [7]; we shall therefore be able to compare the strengths and weaknesses of this approach with those of these earlier formalizations.
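To make the idea of a family of term types indexed by the available free variables concrete, the following is a minimal sketch in Lean 4. It is not taken from the Coq development described here. It assumes that the index is a natural number n, that variables are elements of Fin n, and that S is some type of sorts; the constructor names (var, sort, app, lam, pi) and the example polyId are hypothetical, chosen only for illustration.

```lean
-- Sketch only: PTS-style terms indexed by the number n of free
-- variables that may occur. Indexing by Nat and using Fin n for
-- variables is an assumption of this sketch, not necessarily the
-- representation used in the formalization itself.
inductive Term (S : Type) : Nat → Type where
  -- a variable is one of the n variables in scope
  | var  {n : Nat} : Fin n → Term S n
  -- a sort of the PTS
  | sort {n : Nat} : S → Term S n
  -- application
  | app  {n : Nat} : Term S n → Term S n → Term S n
  -- abstraction: the body lives in a scope with one more variable
  | lam  {n : Nat} : Term S n → Term S (n + 1) → Term S n
  -- dependent product: likewise binds one variable in its codomain
  | pi   {n : Nat} : Term S n → Term S (n + 1) → Term S n

-- A closed term (index 0): an abstraction over the sort s whose body
-- is the bound variable, which has index 0 in a scope of size 1.
def polyId (S : Type) (s : S) : Term S 0 :=
  Term.lam (Term.sort s) (Term.var ⟨0, by decide⟩)
```

In this sketch a binder changes the index of the family rather than naming a variable, so an ill-scoped term cannot be written down: Fin 0 has no elements, and hence no variable can occur in a term of type Term S 0 outside a binder.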
The indexed family approach proves to have quite limited expressive power. We cannot define every operation, nor state every result, in the form we are used to. Careful thought was often needed as to what form a definition or theorem could take. In general, operations involving all variables simultaneously were easy to represent in this formalization, while those involving single variables were difficult. For example, we can define the operation of substituting for every variable simultaneously, but not that of