An empirical analysis of information retrieval based concept location techniques in software comprehension

Brendan Cleary, Chris Exton, Jim Buckley, Michael English

Published online: 11 November 2008
© Springer Science + Business Media, LLC 2008
Editors: Tim Menzies and Letha Etzkorn
Empir Software Eng (2009) 14:93–130
DOI 10.1007/s10664-008-9095-3

B. Cleary (*), C. Exton, J. Buckley, M. English
University of Limerick, Limerick, Ireland
B. Cleary e-mail: brendan.cleary@ul.ie
C. Exton e-mail: chris.exton@ul.ie
J. Buckley e-mail: jim.buckley@ul.ie
M. English e-mail: michael.english@ul.ie

Abstract Concept location, the problem of associating human-oriented concepts with their counterpart solution domain concepts, is a fundamental problem that lies at the heart of software comprehension. Recent research has attempted to alleviate the impact of the concept location problem by applying methods drawn from the information retrieval (IR) community. Here we present a new approach based on a complementary IR method that also has a sound basis in cognitive theory. We compare our approach to related work through an experiment and present our conclusions. This research adapts and expands upon existing language modelling frameworks in IR for use in concept location in software systems. It is novel in that it leverages implicit information available in system documentation. Surprisingly, empirical evaluation of this approach showed little overall performance benefit, and several possible explanations are put forward for this finding.

Keywords Information retrieval · Software comprehension · Empirical analysis

1 Introduction

Software comprehension is widely recognised as one of the most pervasive problems of software engineering (Rajlich and Wilde 2002; Littman et al. 1986; Marcus et al. 2003; Knight and Munro 2002). In the maintenance phase alone, often cited as the most costly