DOI: 10.1515/ms-2017-0177
Math. Slovaca 68 (2018), No. 5, 1149–1172

APPROXIMATION OF INFORMATION DIVERGENCES FOR STATISTICAL LEARNING WITH APPLICATIONS

Milan Stehlík* — Ján Somorčík** — Luboš Střelec*** — Jaromír Antoch****

(Communicated by Gejza Wimmer)

ABSTRACT. In this paper we give a partial response to one of the most important statistical questions, namely, what optimal statistical decisions are and how they are related to (statistical) information theory. We exemplify the necessity of understanding the structure of information divergences and their approximations, which may in particular be understood through deconvolution. Deconvolution of information divergences is illustrated in the exponential family of distributions, leading to the optimal tests in the Bahadur sense. We provide a new approximation of I-divergences using the Fourier transformation, saddle point approximation, and uniform convergence of the Euler polygons. Uniform approximation of deconvoluted parts of I-divergences is also discussed. Our approach is illustrated on a real data example.

© 2018 Mathematical Institute, Slovak Academy of Sciences

1. Introduction

It is well known that one of the most important statistical applications of information theory is the testing of statistical hypotheses, and that deconvolution of information divergences can lead to optimal statistical inference. We illustrate this fact by the deconvolution of the information divergence in the exponential family (see [12] for details), which results in tests optimal in the Bahadur sense.
Let us consider a statistical model with N independent observations y_1, ..., y_N, which are distributed according to the gamma densities

f(y_i \mid \vartheta) = \begin{cases} \dfrac{\gamma_i(\vartheta)^{v_i}}{\Gamma(v_i)}\, y_i^{v_i - 1} \exp\{-\gamma_i(\vartheta) y_i\}, & y_i > 0, \\[4pt] 0, & y_i \le 0, \end{cases} \qquad (1.1)

where \Gamma(t) = \int_0^\infty x^{t-1} e^{-x}\, dx denotes the Gamma function, \vartheta = (\vartheta_1, \ldots, \vartheta_p)^T \in \Theta is a vector of unknown scale parameters, which are the parameters of interest, and v = (v_1, \ldots, v_N)^T is a vector of known shape parameters. The parameter space \Theta is an open subset of R^p, \gamma_i \in C^2(\Theta) and

2010 Mathematics Subject Classification: Primary 62E17, 62F03; Secondary 65L20, 33E30.
Keywords: deconvolution, information divergence, likelihood, change in intensity of Poisson process.
We would like to extend our gratitude for the support from Fondecyt Proyecto Regular No. 1151441 and LIT-2016-1-SEE-023. This work was also supported by Grants P403/15/09663S and GA16-07089S of the Czech Science Foundation, and grant VEGA No. 2/0047/15. Support from the BELSPO IAP P7/06 StUDyS network is also prominently acknowledged. The authors are very grateful to the Editor, Associate Editor and anonymous referees for their valuable comments and extremely careful reading.
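As a numerical illustration of the density (1.1) — a sketch not appearing in the paper itself — the gamma density with known shape v_i and rate \gamma_i(\vartheta) can be evaluated directly; the function name and parameter names below are our own choices for exposition.

```python
import math

def gamma_density(y, rate, shape):
    """Density (1.1): gamma density with known shape v_i and
    rate gamma_i(theta); the density is 0 for y <= 0."""
    if y <= 0:
        return 0.0
    return (rate ** shape) / math.gamma(shape) \
        * y ** (shape - 1) * math.exp(-rate * y)

# Sanity check: for shape v_i = 1 the gamma density reduces to the
# exponential density rate * exp(-rate * y).
rate, y = 2.0, 0.5
print(gamma_density(y, rate, 1.0))          # equals 2 * exp(-1)
print(gamma_density(-1.0, rate, 3.0))       # 0.0, since y <= 0
```

For serious use one would work on the log scale (as in the likelihood computations later in the paper) to avoid overflow of `rate ** shape` and `math.gamma(shape)` for large shape parameters.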