Extended Rauch-Tung-Striebel Controller

Miroslav Zima, Leopoldo Armesto, Vicent Girbés, Antonio Sala and Václav Šmídl

Abstract: This paper presents a novel controller for nonlinear unconstrained systems, coined the Extended Rauch-Tung-Striebel (ERTS) controller. The controller is derived from a general framework based on the duality between optimal control and estimation established by Todorov. The proposed controller uses a Rauch-Tung-Striebel smoother that predicts (filters) future states by linearizing the nonlinear system around the predicted states and then applies backward smoothing. The new controller is applied to path following problems of non-holonomic vehicles and compared with the standard LQR controller, which linearizes the model around the desired trajectory, and with the iterative LQR (iLQR) controller. The main advantages of the ERTS controller over the alternative techniques are good control performance and computational efficiency.

I. INTRODUCTION

This article deals with the duality between optimal control and estimation, and in particular with the derivation of a new controller for nonlinear systems based on this duality. It has been known for more than fifty years [5] that the covariance matrix of the optimal estimator of a linear system with Gaussian noise and the Hessian of the optimal cost-to-go of a linear control problem with quadratic loss evolve in time under similar Riccati-like equations. As a result, both solutions (the Kalman Filter, KF, and the Linear Quadratic Regulator, LQR) have the same form, and an algorithm that computes the KF can also be used as an LQR algorithm. This is known as (Kalman's) duality between optimal control and estimation for linear-Gaussian systems. This property motivated efforts to extend the duality to nonlinear systems; nonetheless, no straightforward generalization to the nonlinear case is known.
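The linear-Gaussian duality described above can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the textbook discrete-time forms of the LQR Riccati recursion and of the Kalman filter predictive covariance recursion, and the function names and the example matrices `A`, `B`, `Q`, `R` are hypothetical, chosen only for illustration. Running the filter recursion on the transposed system (`A.T` as dynamics, `B.T` as observation matrix, `Q` and `R` as noise covariances) should reproduce the LQR cost-to-go Hessian.

```python
import numpy as np


def kf_predicted_covariance(F, H, W, V, Sigma):
    """One step of the Kalman filter predictive covariance recursion."""
    S = V + H @ Sigma @ H.T                  # innovation covariance
    K = F @ Sigma @ H.T @ np.linalg.inv(S)   # predictor gain
    return W + F @ Sigma @ F.T - K @ H @ Sigma @ F.T


def lqr_cost_to_go(A, B, Q, R, P):
    """One backward step of the discrete-time LQR Riccati recursion."""
    G = R + B.T @ P @ B
    L = np.linalg.inv(G) @ B.T @ P @ A       # feedback gain, u = -L x
    return Q + A.T @ P @ A - A.T @ P @ B @ L


# Hypothetical double-integrator-like system (illustration only).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.diag([1.0, 0.1])    # state cost / dual process noise covariance
R = np.array([[0.01]])     # control cost / dual measurement noise covariance

P = Q.copy()       # terminal cost Hessian
Sigma = Q.copy()   # dual initial covariance
for _ in range(50):
    P = lqr_cost_to_go(A, B, Q, R, P)
    # Dual estimation problem: dynamics A^T, observation matrix B^T.
    Sigma = kf_predicted_covariance(A.T, B.T, Q, R, Sigma)

# Largest entrywise difference between the two recursions.
riccati_gap = float(np.max(np.abs(P - Sigma)))
print(riccati_gap)
```

The two recursions differ only in how the matrices are labeled, which is why a single routine can serve both problems in the linear-Gaussian case.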
A satisfactory theoretical extension was introduced in [13], where a general (Todorov's) duality between optimal control and estimation is obtained for a slightly reformulated optimal control problem based on the Kullback-Leibler divergence. Applied to the LQ problem, the new general duality does not give the same algorithm as Kalman's approach, because the estimation problem differs between the two approaches: prediction in Kalman's case and smoothing in Todorov's.

This work was supported by the European Regional Development Fund and the Ministry of Education, Youth and Sports of the Czech Republic under project No. CZ.1.05/2.1.00/03.0094: Regional Innovation Centre for Electrical Engineering (RICE), and by the MAGV Project (PAID-05-10 Program from VIDI UPV), the VALi+d Program and PrometeoII/2013/004 (Generalitat Valenciana), and project DPI2011-27845-C02-01 from the Spanish Government.
M. Zima and V. Šmídl are with the Regional Innovation Centre for Electroengineering, University of West Bohemia, Plzen, Czech Republic.
L. Armesto and V. Girbés are with the Instituto de Diseño y Fabricacion at Universitat Politecnica de Valencia, Spain.
A. Sala is with the Instituto Universitario de Automatica e Informatica Industrial at Universitat Politecnica de Valencia, Spain.

This paper proposes a new controller, coined the Extended Rauch-Tung-Striebel (ERTS) controller, derived from the duality between optimal control and estimation. The proposed controller is based on the solution of the dual estimation problem given by the Rauch-Tung-Striebel (RTS) forward-backward smoother. The computed estimate of the next state is then used to compute the optimal control. This results in an efficient controller with complexity $O(N^2)$ in the state dimension. The controller is optimal for LQ systems, and it is extended to nonlinear settings by linearization along the predicted trajectory. The performance of the ERTS controller is then illustrated on the (unconstrained) path following problem.
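For readers unfamiliar with the RTS forward-backward smoother mentioned above, the following is a minimal sketch of the standard smoother for a scalar linear-Gaussian model. It is not the ERTS controller itself: the model, the function name `rts_smoother`, its parameters and the measurement values are assumptions made only for illustration. The forward pass is an ordinary Kalman filter; the backward pass corrects each filtered estimate using the smoothed estimate of the following time step.

```python
def rts_smoother(ys, a, c, w, v, m0, p0):
    """Scalar Kalman filter followed by the Rauch-Tung-Striebel backward pass.

    Assumed model (illustration only):
        x[t+1] = a * x[t] + process noise with variance w
        y[t]   = c * x[t] + measurement noise with variance v
    m0 and p0 are the prior mean and variance of x[0].
    """
    m_pred, p_pred, m_filt, p_filt = [], [], [], []
    m, p = m0, p0
    for y in ys:
        m_pred.append(m)               # prediction of x[t] before seeing y[t]
        p_pred.append(p)
        k = p * c / (c * c * p + v)    # Kalman gain
        m = m + k * (y - c * m)        # measurement update
        p = (1.0 - k * c) * p
        m_filt.append(m)
        p_filt.append(p)
        m, p = a * m, a * a * p + w    # time update (prediction of x[t+1])

    # Backward pass: start from the last filtered estimate.
    m_smooth, p_smooth = m_filt[:], p_filt[:]
    for t in range(len(ys) - 2, -1, -1):
        g = p_filt[t] * a / p_pred[t + 1]   # smoother gain
        m_smooth[t] = m_filt[t] + g * (m_smooth[t + 1] - m_pred[t + 1])
        p_smooth[t] = p_filt[t] + g * g * (p_smooth[t + 1] - p_pred[t + 1])
    return m_filt, p_filt, m_smooth, p_smooth


ys = [0.9, 1.1, 1.0, 1.3, 0.8]   # made-up measurements
m_filt, p_filt, m_smooth, p_smooth = rts_smoother(
    ys, a=1.0, c=1.0, w=0.1, v=0.5, m0=0.0, p0=1.0
)
for t in range(len(ys)):
    print(t, round(m_filt[t], 4), round(m_smooth[t], 4))
```

At the final time step the smoothed and filtered estimates coincide, and at earlier steps the smoothed variance is never larger than the filtered one, since smoothing also uses the later measurements.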
The goal of the path following problem is to make a robot follow a desired path by means of a control law. Owing to its wide range of straightforward applications, e.g. motion planning [3], [1], parking [4], overtaking and lane changing [10], and vision-based line following [9], this problem has been studied intensively in recent years. However, the proposed controllers are commonly specialized to particular tasks or contain "artificial" design parameters (e.g. a look-ahead distance) that have to be tuned. The new method is compared with standard LQR approaches and with the iterative LQR method (iLQR) [14], and achieves a significant improvement in accuracy and time efficiency.

The paper is organized as follows: Section II introduces the duality between estimation and control. Section III specializes the ideas of the previous section and presents the ERTS controller. The controller is applied in Section IV to the path following problem, and its performance is compared with linearized LQR and iLQR controllers. Conclusions are drawn in Section V.

II. PRELIMINARIES AND PROBLEM STATEMENT

Consider a stochastic nonlinear dynamic system modeled as a Markov process with a known transition probability depending on the current state $x_t$ and the control action $u_t$:

$$x_{t+1} \sim p(x_{t+1} \mid x_t, u_t). \tag{1}$$

For an arbitrary stochastic control given by the distribution $\pi_t(u_t)$, the resulting distribution of $x_{t+1}$ is

$$x_{t+1} \sim p_\pi(x_{t+1} \mid x_t) = \int_{\mathbb{R}^{n_u}} p(x_{t+1} \mid x_t, u_t)\, \pi_t(u_t)\, du_t. \tag{2}$$

Let us consider obtaining a stochastic controller which optimizes the following expected$^{1}$ loss

$$J(x_0, \bar{s}_{0:N}, \pi_{0:N-1}) = E\left[ q_N(x_N, \bar{s}_N) + \sum_{t=0}^{N-1} l_t(x_t, \bar{s}_t, \pi_t) \right] \tag{3}$$

$^{1}$ The expectation is taken over realizations of the random variables $x_{1:N}$.