Journal of Intelligent & Fuzzy Systems xx (20xx) x–xx
DOI:10.3233/JIFS-179007
IOS Press

Prediction of reading difficulty in Russian academic texts

Valery Solovyev a,∗, Marina Solnyshkina b, Vladimir Ivanov c and Ildar Batyrshin d

a Research and Education Center on Linguistics named after I.A. Boduen de Kurtene, Kazan Federal University, Kazan, Russian Federation
b Department of German Philology, Higher School of Russian and Foreign Philology, Kazan Federal University, Russian Federation
c Innopolis University, 1, Universitetskaya Str., Innopolis, Russian Federation
d Centro de Investigación en Computación, Instituto Politécnico Nacional, CDMX, Mexico

Abstract. Education policy makers view measuring the readability of academic texts and profiling classroom textbooks as a primary task of education management aimed at sustaining the quality of reading programs. As Russian readability metrics, i.e. “objective” features of a text that determine its complexity for readers, are still a research niche, we undertook a comparative analysis of the features of academic texts, exemplified by Social Science textbooks and examination texts of Russian as a foreign language. Experiments with 7 classifiers and 4 methods of linear regression on the Russian Readability corpus demonstrated that ranking textbooks for native speakers is a much more difficult task than ranking examination texts written (or designed) for foreign students. The authors see a possible reason for this in the differences between two processes: acquiring a native language on the one hand and learning a foreign language on the other. The results of the current study are extremely relevant in modern Russia, which is joining the Bologna Process and needs to provide profiled texts for all types of learners and testees. Based on a qualitative and quantitative analysis of a text, the research offers a guide for education managers to help build consensus on selecting reading material when educators have differing views.
Keywords: Text readability, machine learning, Russian academic text, text complexity, examination tests

∗ Corresponding author. Valery Solovyev, Research and Education Center on Linguistics named after I.A. Boduen de Kurtene, Kazan Federal University, 18 Kremlyovskaya street, Kazan 420008, Russian Federation. Tel.: +7 843 233 75 12; Fax: +7 843 292 74 18; E-mail: maki.solovyev@mail.ru.

ISSN 1064-1246/19/$35.00 © 2019 – IOS Press and the authors. All rights reserved

Introduction

Modern communication, defined as ‘the imparting or exchanging of information by speaking, writing, or using some other medium’ (Oxford English Dictionary, 1996), implies either generating or receiving a text, which may be handwritten, printed, electronic or oral. Successful communication, in turn, largely depends on whether the amount, content and structure of the quanta of information sent by its generator in the text and received by the addressee are similar or, in an ideal situation, the same. Thus, for the information in any text (written or oral) to be elicited, processed and stored in the recipient’s mind, it is important that the text itself aligns with the cognitive and linguistic abilities of the recipient. Matching a text to its target audience is a problem relevant to a number of spheres: the military, education, PR, advertising, government, business, publishing, medicine and social relations, as these are the areas where communication is the foundation of success. Research shows that companies suffer financial losses if the texts to which they expose their customers are hard for the average reader to read [1]. Conversely, if a text is too easy, i.e. primitive for its audience, readers lose interest and stop reading. In modern science the problem of text complexity is positioned