Information Processing and Management 53 (2017) 454–472

Exploring the space of information retrieval term scoring functions

Parantapa Goswami, Eric Gaussier, Massih-Reza Amini
Université Grenoble Alpes, CNRS, LIG/AMA, France

Article history: Received 7 February 2016; Revised 7 November 2016; Accepted 13 November 2016

Keywords: IR theory; Function generation; Automatic discovery; IR constraints

Abstract

In this paper we are interested in finding good IR scoring functions by exploring the space of all possible IR functions. Earlier approaches to this problem, however, explore only a small subpart of the space, with no control over which part is explored and which is not. We aim here at a more systematic exploration: first, by defining a grammar to generate possible IR functions up to a certain length (the length being related to the number of elements, variables and operations, involved in a function); and second, by relying on IR heuristic constraints to prune the search space and filter out bad scoring functions. The obtained candidate scoring functions are tested on various standard IR collections, and several simple but promising functions are identified. We perform extensive experiments to compare these functions with classical IR models and observe that they yield either better or comparable results. We also compare the performance of functions satisfying IR heuristic constraints with that of functions which do not; the former set clearly outperforms the latter, which shows the validity of IR heuristic constraints for designing new IR models.

© 2016 Elsevier Ltd. All rights reserved.

1. Introduction

The quest for new, high performing IR scoring functions has been one of the main goals of IR research, ever since the beginning of the field in the late forties.
This quest has led to many IR models, from the boolean model and the vector space model (Salton & McGill, 1983) to more recent proposals such as the language model (Ponte & Croft, 1998) and the relevance model (Lavrenko & Croft, 2003), BM25 (Robertson & Zaragoza, 2009) and, more generally, probabilistic models (Jones, Walker, & Robertson, 2000a, 2000b), the HMM model (Metzler & Croft, 2005), the divergence from randomness framework (Amati & Rijsbergen, 2002) with the information-based models (Clinchant & Gaussier, 2010), and learning-to-rank approaches (Liu, 2009). These models originated from the fertile imagination and thinking of scientists, who either relied on first principles, within a given theoretical framework, to derive new scoring functions, or devised learning procedures to identify the best function in a given family of functions, typically the family of linear functions. The space of possible IR scoring functions is, however, tremendously larger than the one explored through such a process. Quoting Fan, Gordon, and Pathak (2004):

"There is no guarantee that existing ranking functions are the best/optimal ones available. It seems likely that more powerful functions are yet to be discovered."

Corresponding author. E-mail addresses: parantapa.goswami@imag.fr (P. Goswami), eric.gaussier@imag.fr (E. Gaussier), massih-reza.amini@imag.fr (M.-R. Amini).
http://dx.doi.org/10.1016/j.ipm.2016.11.003
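To make the generate-and-filter idea of the abstract concrete, the following minimal sketch enumerates candidate term-scoring functions from a small grammar up to a given expression length and keeps only those satisfying a heuristic constraint (here, that the score grows with term frequency). The variables (`tf`, `idf`), the grammar, and the constraint check are illustrative assumptions, not the exact grammar or constraints used in the paper.

```python
import math

# Illustrative grammar: two variables, a few unary and binary operations.
# These choices are hypothetical, not the paper's actual grammar.
VARIABLES = ["tf", "idf"]
UNARY = {"log1p": lambda x: math.log(1 + x), "sqrt": math.sqrt}
BINARY = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

def generate(length):
    """Yield (description, function) pairs for expressions of up to `length` nodes."""
    if length >= 1:
        for v in VARIABLES:
            yield v, (lambda env, v=v: env[v])
    if length >= 2:
        # Wrap smaller expressions in a unary operation.
        for sub_desc, sub_f in generate(length - 1):
            for name, op in UNARY.items():
                yield f"{name}({sub_desc})", (lambda env, op=op, f=sub_f: op(f(env)))
    if length >= 3:
        # Combine two smaller expressions with a binary operation.
        for l_desc, l_f in generate(length - 2):
            for r_desc, r_f in generate(length - 2):
                for name, op in BINARY.items():
                    yield f"({l_desc} {name} {r_desc})", (
                        lambda env, op=op, lf=l_f, rf=r_f: op(lf(env), rf(env))
                    )

def satisfies_tf_constraint(f):
    """Heuristic filter: the score must strictly increase with tf (idf held fixed)."""
    scores = [f({"tf": tf, "idf": 2.0}) for tf in (1, 2, 5, 10)]
    return all(a < b for a, b in zip(scores, scores[1:]))

candidates = [(desc, f) for desc, f in generate(3) if satisfies_tf_constraint(f)]
```

For instance, `tf`, `log1p(tf)` and `(tf * idf)` survive the filter, whereas `idf` alone does not, since it is constant in `tf`. The actual search described in the paper operates on a richer grammar and a fuller set of IR heuristic constraints, but the structure is the same: exhaustive generation up to a length bound, then constraint-based pruning.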