Predictability of Off-line to On-line Recommender Measures via Scaled Fuzzy Implicators

Ladislav Peska
Faculty of Mathematics and Physics
Charles University, Prague, Czechia
peska@ksi.mff.cuni.cz

Peter Vojtas
Faculty of Mathematics and Physics
Charles University, Prague, Czechia
vojtas@ksi.mff.cuni.cz

Abstract—This paper introduces the fuzzy Challenge Response Framework, designed to understand the relationship between a model of a real-world situation and real observations, based on scaled fuzzy implicators between them. This general framework is applied to a particular case in recommender systems: the prediction of on-line performance given off-line evaluation results. We perform an empirical evaluation with real data from a Czech travel agency, comparing different recommender algorithms, different metrics for on-line and off-line evaluation, and different implication operators.

Index Terms—fuzzy web intelligence, recommender systems, fuzzy decision support systems, on-line vs. off-line evaluation

I. INTRODUCTION

Theoretical algorithmic models are required to be sound and complete. That is, computed results should be correct and all correct results should be computable. More challenging are scenarios where models are connected to observable reality (either physical, e.g. weather forecasting, or biological, e.g. medical diagnosis). At this point, the problem of how to measure soundness and completeness arises. The situation becomes even more challenging when human psychology is involved. One class of such real situations involves users/customers aiming to buy a product in an e-shop and recommender systems (RS) aiming to model the users' preferences by observing their behavior. Instead of correct answers, an RS responds with an ordered list of items that corresponds to its model of the user's preferences. Soundness can be understood as the degree of the user's satisfaction with this ordered response.
Soundness becomes the only realistic goal (it is unrealistic to ask for completeness when human evaluation is involved or, e.g., when considering problems on the open web).

Jannach and Jugovac [1] critically discussed the value of algorithmic improvements in off-line recommender systems evaluation scenarios, which are common in academia. On the other hand, on-line evaluation in real-world scenarios also has certain drawbacks, such as high resource demands, temporal complexity, the lack of repeatability, or a potential negative impact on the user experience [2]. Nonetheless, the connection between off-line and on-line evaluation (and the particular metrics utilized in each scenario) is often unclear and is being intensively researched. Therefore, we selected the problem of RS evaluation as a use case for the proposed fuzzy Challenge Response Framework (fChRF).

This paper has been supported by the Czech Science Foundation (GAČR) project Nr. 19-22071Y and by Charles University project Progres Q48. Source codes, evaluation data and complete results are available from github.com/lpeska/FUZZ-IEEE2020.

We understand the relationship between a solution (model, algorithm) and the relevance/satisfaction of the user as a fuzzy set inclusion/implication (e.g., computed → correct, model → reality, or off-line evaluation → on-line evaluation for our use case). Many observed phenomena in RS are inherently fuzzy. This leads us to consider fuzzy implicators when measuring the success of the models.

Fuzzy logic has been used for flexible database querying for more than 30 years. As early as in the works of Zimmermann [3] and Fagin [4], [5], fuzzy sets were used as scores interpreted as encoding an ordering of query results. In [6], Bordogna et al. reviewed the role of the inclusion operator in the interpretation of queries addressed to databases and Information Retrieval systems. Dubois and Prade [7] identified the role of fuzzy sets in answering queries with incomplete data and/or with ambiguity.
Bosc and Pivert [8] analyzed trade-off non-commutative operators (e.g. convex combinations of conjunctive and disjunctive ones), enabling the merging of positive and negative judgements. In general, we follow the idea of Bellman et al. [9], where real-world signal data and application needs contributed to the invention of the fuzzy set model. Likewise, we base our work on real-world data and a real-world use case.

978-1-7281-6932-3/20/$31.00 ©2020 IEEE

The idea of the Challenge Response Framework (ChRF) was motivated by the work of Galois [10]. Galois dealt with the problem of the existence of a formula for the roots of higher-degree polynomials. He constructed a correspondence between fields and groups acting on roots in such a way that we can gather information about the group's structure from the field's structure and vice versa (see https://www.math3ma.com/blog/what-is-galois-theory-anyway). Motivated by Galois, in [11] we introduced Galois-Tukey (GT) connections, using correspondence to gather information between structures of the real line (e.g. topology and measure). In [12], Blass interpreted GT connections as complexity reductions in computer science and illustrated this with the reduction of the 3SAT search problem to the 3COL search problem (vertex 3-colorability of graphs). Challenges are sets of formulas/graphs; responses are variable