Machine Vision and Applications (1997) 9: 304–320
© Springer-Verlag 1997

A methodology for evaluation of task performance in robotic systems: a case study in vision-based localization

Péter L. Venetianer, Edward W. Large, Ruzena Bajcsy

GRASP Lab, Computer and Information Science Department, University of Pennsylvania, 3401 Walnut 301, Philadelphia, PA 19104-6228, USA
venetian@grip.cis.upenn.edu, large@grip.cis.upenn.edu, bajcsy@central.cis.upenn.edu

Abstract. We investigated, in a principled way, the performance of an agent that uses visual information in a partially unknown and changing environment. We propose a methodology to study and evaluate the performance of autonomous agents. We first analyze the system theoretically to determine the most important system parameters and to predict error bounds and biases. We then conduct an empirical analysis to update and refine the model. The ultimate goal is to develop self-diagnostic procedures. We show that although simple models can successfully predict some major effects, empirically observed performance deviates from theoretical predictions in interesting ways.

1 Introduction

In this paper we investigate, in a principled way, the performance of an agent that acts using visual information in a partially unknown and changing environment. Unlike in the empirical sciences, where experiments are performed to identify mechanisms of task performance, in the engineering sciences artificial agents are designed according to well-known principles. Thus we know the contents of our “black boxes” a priori and can predict in detail how the agent will perform in a constrained environment. What is often less clear, however, is how the artificial agent will interact with the (very complex) real world. The environment in which the agent perceives and acts provides uncontrolled sources of variability that determine the performance of the agent.
Our goal, in keeping with good engineering science, is to determine by empirical experiment those aspects of agent-environment interaction that affect performance in systematic ways. In so doing, we hope to show the relationship between a deterministic model of the agent, systematic determinants of performance that arise from the agent-environment interaction, and those determinants of performance that are best modeled stochastically.

In order to focus on these issues we selected the task of landmark-based localization for a mobile agent: the determination of position and orientation relative to a landmark (also known as pose estimation). Within the larger system, localization serves two purposes. First, we assume that the environment contains definite, distinguishable landmarks that the agent can detect. By estimating its location with respect to landmarks, the agent can determine its global position using an internal map that indicates the position of each landmark. These landmarks are assumed to be stationary. Second, by identifying landmarks mounted on other mobile agents, each agent can identify the others and localize itself with respect to them. In both cases the agent must determine its distance from the landmark and its orientation with respect to the normal of the landmark. This operation does not assume any representation of the environment, nor is it based on potentially unreliable odometry readings. However, this strategy can be combined with odometry readings, which can be recalibrated regularly using the more precise localization strategy. The localization algorithm used here is not unique or novel; rather, it serves as a means of demonstrating our experimental methodology.

Our analysis proceeds in two stages. First, we analyze the equations that the agent uses to perform the localization task. This analysis helps us to identify the most important variables in determining the performance of the agent as it interacts with the environment [1]. In addition, the analysis makes certain predictions about the performance of the localization algorithm. Next, informed by this analysis, we design an empirical experiment with which to test the performance of the agent in a realistic environment. We use an analysis of variance to evaluate our results.

By these means we address three issues. First, we attempt to verify that the agent actually performs within predicted error bounds as it interacts with the environment. Second, we determine systematic deviations from predicted performance as we manipulate various factors that are determined by the agent-environment interaction. Finally, we address the basic methodological question of how well we can model the agent-environment interaction based on physics and geometry, what factors are best explored by empirical experimentation, and what must or can be understood or modeled as stochastic processes.

The remainder of this paper is organized as follows: Sect. 2 presents the localization algorithm, Sect. 3 addresses