Received May 31, 2020, accepted June 16, 2020, date of publication June 23, 2020, date of current version July 2, 2020. Digital Object Identifier 10.1109/ACCESS.2020.3004409 Confident Classification Using a Hybrid Between Deterministic and Probabilistic Convolutional Neural Networks MUHAMMAD NASEER BAJWA 1,3,* , (Graduate Student Member, IEEE), SULEMAN KHURRAM 1,3,* , MOHSIN MUNIR 1,3,* , SHOAIB AHMED SIDDIQUI 1,3 , MUHAMMAD IMRAN MALIK 2,4 , ANDREAS DENGEL 1,3 , AND SHERAZ AHMED 3 1 Fachbereich Informatik, Technische Universität Kaiserslautern, 67663 Kaiserslautern, Germany 2 School of Electrical Engineering and Computer Science, National University of Science and Technology (NUST), Islamabad 46000, Pakistan 3 German Research Center for Artificial Intelligence GmbH (DFKI), 67663 Kaiserslautern, Germany 4 Deep Learning Laboratory, National Center of Artificial Intelligence, Islamabad 46000, Pakistan * Muhammad Naseer Bajwa, Suleman Khurram, and Mohsin Munir contributed equally to this work. Corresponding author: Muhammad Naseer Bajwa (naseer.bajwa@dfki.de) This work was supported in part by the National University of Science and Technology (NUST), Pakistan, through the Prime Minister’s Program for Development of the Ph.D. in Science and Technology, in part by the Bundesministerium für Bildung und Forschung (BMBF) Project Deep Fusion for Neural Networks (DeFuseNN) under Grant 01IW17002, and in part by the NVIDIA Artificial Intelligence Laboratory (NVAIL) Program. ABSTRACT Traditional neural networks trained using point-based maximum likelihood estimation are deterministic models and have exhibited near-human performance in many image classification tasks. However, their insistence on representing network parameters with point-estimates renders them incapable of capturing all possible combinations of the weights; consequently, resulting in a biased predictor towards their initialisation. Most importantly, these deterministic networks are inherently unable to provide any uncertainty estimate for their prediction which is highly sought after in many critical application areas. On the other hand, Bayesian neural networks place a probability distribution on network weights and give a built-in regularisation effect making these models able to learn well from small datasets without overfitting. These networks provide a way of generating posterior distribution which can be used for model’s uncertainty estimation. However, Bayesian estimation is computationally very expensive since it greatly widens the parameter space. This paper proposes a hybrid convolutional neural network which combines high accuracy of deterministic models with posterior distribution approximation of Bayesian neural networks. This hybrid architecture is validated on 13 publicly available benchmark classification datasets from a wide range of domains and different modalities like natural scene images, medical images, and time-series. Our results show that the proposed hybrid approach performs better than both deterministic and Bayesian methods in terms of classification accuracy and also provides an estimate of uncertainty for every prediction. We further employ this uncertainty to filter out unconfident predictions and achieve significant additional gain in accuracy for the remaining predictions. INDEX TERMS Bayesian estimation, convolutional neural networks, hybrid neural networks, image classification, time-series classification, uncertainty estimation. I. INTRODUCTION Over the last decade, Convolutional Neural Networks (CNNs) have made phenomenal strides in various classification tasks using a wide array of input modalities. These powerful algorithms have achieved impressive performance, often at par with human experts, in many challenging natural scene The associate editor coordinating the review of this manuscript and approving it for publication was Sabu M Thampi . image recognition tasks [1]–[3] and even in sensitive and critical application areas like medical image analysis for disease prediction [4]–[8]. These CNNs gained significant attention due to their parameter efficiency, in contrast to other deep learning models like densely connected Multi-Layer Perceptrons (MLPs), resulting in comparatively better gen- eralisation performance. They are particularly powerful in analysing visual modalities like images and videos [9] but have also proved their worth in time-series analysis where 115476 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ VOLUME 8, 2020