Semi-Supervised Approach Using Nearest Neighbors Clustering and Deep Learning Bruno V. A. Lima Adri˜ ao D. D. Neto ∗∗ ucia E. S. Silva ∗∗∗ Vinicius P. Machado ∗∗∗∗ Jo˜ ao G. C. Costa Department of Computer Engineering and Automation, Federal University of Rio Grande do Norte, RN, (e-mail: brunovicente@ufrn.edu.br). ∗∗ Department of Computer Engineering and Automation, Federal University of Rio Grande do Norte, RN (e-mail: adriao@dca.ufrn.br) ∗∗∗ Department of Computer, Federal University of Piaui, PI, (e-mail:luciaemilia72@gmail.com) ∗∗∗∗ Department of Computer, Federal University of Piaui, PI, (e-mail:vinicius@ufpi.edu.br) Institute of Mathematical and Computer Sciences , Universidad of S˜ao Paulo, SP, (e-mail:Joaogcavalcanti@usp.br) Abstract: The field of Deep Learning is in constant evolution, with new techniques and applications being developed by the day. Of those techniques, semi-supervised deep learning have promising results, especially in combination with the standard Convolutional Neural Network (CNN) architectures. CNNs attain state-of-the-art performance on various classification tasks assuming a sufficiently large number of labeled training examples. Unfortunately, labeling sufficiently large training data sets requires human involvement, which is expensive and time consuming. In semi-supervised learning there is not only a set of labeled samples (L), but also a set of unlabeled samples (U ), which is generally greater than the first (U>L). This paper presents a semi-supervised model using a CNN supported by a Multilayer Perceprton (MLP) network, and a clustering process by k Nearest Labeled Neighbors. The results showed that the proposed model solves the semi-supervised learning problem over different scenarios. Keywords: Semi-supervised; Deep Learning; Neighbors; Labeling. 1. INTRODUCTION Recent developments in technology increased the possibi- lities on how to obtain data. For example, one can use the Internet to search any information on economics, in- dustrial designs, health problems, scientific results and so on. The technological advance has also contributed to an accumulation of information, by increasing the capacity of storage devices with higher speed of access and lower cost. Some of this information can be analyzed in order to generate knowledge that assists in decision-making tasks related to the data. Usually, machine learning techniques are used to make such analyzes. One of those techniques is semi-supervised learning, which handles classification problems. In these problems, a small amount of samples have a label which specifies the class of the sample. In contrast, most of the samples are unlabeled, meaning they are not previously classified. This group is usually larger because it is easier to acquire such type of information, as it does not require a prior processing. In the context of semi-supervised learning, techniques are needed to process labeled and unlabeled data simultane- ously. In this sense, this paper presents a model for working with semi-supervised learning using a Deep Learning te- chnique known as Convolutional Neural Network (CNN), and supported by a Multilayer Perceptron network (MLP). Experimental tests have shown the efficiency of the mo- del in solving semi-supervised learning problems, such as processes clustering, data labeling, and classifying data. Outperforming the classical methods Co-training and SE- EDED K-means. 2. RELATED WORKS In Sajjadi et al. (2016) the problem of semi-supervised lear- ning with deep Convolutional Neural Networks is introdu- ced. The authors propose an unsupervised regularization that explicitly forces the classifier’s prediction for multiple classes to be mutually exclusive and effectively guides the decision boundary to be in the low-density space between the manifolds corresponding to different classes of data. The method can be aplied to images. In Chamberlain et al. (2016) it is presented the deve- lopment of a semi-supervised deep learning algorithm for automatically classify lung sounds using a Convolutional Neural Network, this paper solver a specificy problem with Deep Semi-supervised. In Noroozi et al. (2017), it is proposed a deep semisupervi- sed model named SEmi-supervised VErification Network DOI: 10.17648/sbai-2019-111379 1676