Received August 22, 2021, accepted November 19, 2021, date of publication December 28, 2021, date of current version January 20, 2022.

Digital Object Identifier 10.1109/ACCESS.2021.3138978

One Versus All for Deep Neural Network for Uncertainty (OVNNI) Quantification

GIANNI FRANCHI 1, ANDREI BURSUC 2, EMANUEL ALDEA 3, (Member, IEEE), SÉVERINE DUBUISSON 4, AND ISABELLE BLOCH 5

1 ENSTA Paris, Institut Polytechnique de Paris, 91120 Palaiseau, France
2 Valeo.AI, 75017 Paris, France
3 SATIE, Université Paris-Saclay, 91190 Gif-sur-Yvette, France
4 CNRS, LIS, Aix-Marseille University, 13007 Marseille, France
5 Sorbonne Université, CNRS, LIP6, 75005 Paris, France

Corresponding author: Gianni Franchi (gianni.franchi@ensta-paris.fr)

This work was supported in part by the French National Research Agency (ANR) Project MOHICANS under Grant ANR-15-CE39-0005, ‘‘Towards Modelling High-density Crowds for Assisting Planning and Safety,’’ and in part by the Saclay-IA Cluster and the CNRS Jean-Zay Supercomputer for the computation resources under Grant 2020-AD011011970.

ABSTRACT Deep neural networks (DNNs) are powerful learning models, yet their results are not always reliable. This drawback stems from the fact that modern DNNs are usually overconfident, so their epistemic uncertainty cannot be straightforwardly characterized. In this work, we propose a new technique to easily quantify epistemic uncertainty. This method consists of mixing the predictions of an ensemble of DNNs trained to classify One class versus All the other classes (OVA) with the predictions of a standard DNN trained to perform All versus All (AVA) classification. First, the adjustment provided by the AVA DNN to the scores of the base classifiers allows for a more fine-grained inter-class separation. Moreover, the two types of classifiers mutually reinforce their detection of out-of-distribution (OOD) samples, entirely circumventing the need for such samples during training.
The additional cost of constructing the ensemble is offset by the ease of use of our proposed strategy and by its enhanced generalization potential, as it does not tie its performance in a given context to specific OOD datasets. Extensive experiments confirm the wide applicability of our approach: our method achieves state-of-the-art performance in quantifying OOD data across multiple datasets and architectures, while requiring little hyper-parameter tuning.

INDEX TERMS Uncertainty estimation, DNN ensembles, one vs all classification, all vs all classification.

I. INTRODUCTION

Anomaly detection is the task of detecting data that deviate from the training distribution. Deep neural networks (DNNs) have reached state-of-the-art performance on machine learning [20], [45] and computer vision tasks [40], [70]. This significant progress has raised interest in adopting them in a wide range of decision-making systems, including safety-critical ones. Yet, one of the main weaknesses of these techniques is that they tend to be overconfident [22] in their decisions, even when they are wrong [22], [25], [63], which may lead DNNs to miss anomalies. This issue is difficult to tackle, as the high inner complexity of DNNs results in poor output explainability.

Anomaly or outlier detection is a broad research topic [66]. The objective is to detect rare or corrupted data that differ from what we consider to be normal data. This topic has many practical applications, such as risk management [81], safety [69], or automatic inspection and non-destructive control [38]. Anomalies can also be linked to the knowledge uncertainty [27] of DNNs.

The associate editor coordinating the review of this manuscript and approving it for publication was Mingbo Zhao.
The precise identification of anomalies in DNN predictions is crucial for improving the reliability of such models, and a key step towards their deployment in practical settings. To address this important problem, we propose to leverage a finer quantification of the uncertainty of DNNs. In contrast to most Bayesian DNN techniques [4], [17], [18], [35], [56], or to frequentist techniques such as Deep Ensembles [43], our approach relies on One versus All (OVA) training. In the statistical learning community, ensembles of OVA or One versus One (OVO) base classifiers for multi-class prediction have been particularly popular in association with Support Vector Machines (SVMs), due to the SVM being essentially a binary classifier, and to the simplicity of

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 10, 2022
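The score mixing described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes the AVA network produces softmax probabilities over all classes, each OVA network produces an independent sigmoid score for its class, and the per-class confidence is their product; the function name and the toy logits are purely illustrative.

```python
import numpy as np

def ovnni_scores(ava_logits, ova_logits):
    """Combine AVA softmax probabilities with per-class OVA sigmoid
    scores: the confidence for class c is the product of the AVA
    probability of c and the score of the c-th one-vs-all classifier."""
    # AVA: softmax over all classes (numerically stabilized)
    shifted = ava_logits - ava_logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    p_ava = exp / exp.sum(axis=-1, keepdims=True)
    # OVA: independent sigmoid per binary one-vs-all classifier
    p_ova = 1.0 / (1.0 + np.exp(-ova_logits))
    return p_ava * p_ova

# Toy example: both classifier types agree that class 1 is likely.
ava = np.array([[0.1, 4.0, 0.2]])    # AVA logits for 3 classes
ova = np.array([[-3.0, 5.0, -2.0]])  # one logit per OVA classifier
scores = ovnni_scores(ava, ova)
pred = scores.argmax(axis=-1)  # predicted class
conf = scores.max(axis=-1)     # confidence usable for OOD detection
```

For an OOD input, the OVA classifiers would tend to assign low sigmoid scores to every class, driving the maximum combined score down even when the AVA softmax remains (overconfidently) peaked.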