Received May 2, 2021, accepted May 20, 2021, date of publication June 21, 2021, date of current version June 29, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3090918

Gradient Descent Effects on Differential Neural Architecture Search: A Survey

SANTANU SANTRA 1, JUN-WEI HSIEH 2, AND CHI-FANG LIN 1
1 Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 32003, Taiwan
2 College of Artificial Intelligence and Green Energy, National Chiao Tung University, Hsinchu 30010, Taiwan
Corresponding author: Jun-Wei Hsieh (jwhsieh@nctu.edu.tw)

This work was supported by the Ministry of Science and Technology (MOST), Taiwan, under Grant 109-2221-E-009-116-MY3.

ABSTRACT Gradient descent, an effective way to search for the local minimum of a function, can minimize the training and validation loss of neural architectures; applied in an appropriate manner, it can also reduce the search cost of neural architecture search. Neural architecture search (NAS) is now widely used to construct architectures automatically for specific tasks. Most well-performing NAS methods adopt reinforcement learning, evolutionary algorithms, or gradient descent to find the best-performing candidate architecture. Among these methods, gradient-descent-based architecture search approaches outperform the others in terms of efficiency, simplicity, computational cost, and validation error. In view of this, an in-depth survey is needed to cover the usefulness of the gradient descent method and how it can benefit neural architecture search. We begin our survey with the basic concepts of neural architecture search and gradient descent and their distinctive properties. Our survey then delves into the impact of the gradient descent method on NAS and explores its effect on the search process that generates candidate architectures.
At the same time, our survey reviews the most widely used gradient-based search approaches in NAS. Finally, we present the current research challenges and open problems of NAS-based approaches, which need to be addressed in future research.

INDEX TERMS Gradient descent, neural architecture search, reinforcement learning, evolutionary algorithm, back-propagation.

The associate editor coordinating the review of this manuscript and approving it for publication was Yiming Tang.

I. INTRODUCTION

Automatic machine learning (AutoML) has become a favorable solution for developing deep learning (DL) systems without human effort. An AutoML system consists of data preprocessing, feature generation, network model generation, and performance evaluation. Although an AutoML system consists of several stages, the most critical ones are model generation and performance estimation. In the model generation stage, the network is either designed by machine learning experts or produced by an automatic design process. The automated architecture design process is known as neural architecture search (NAS). The rapid development of NAS and the demand for it continue to displace human experts in designing architectures for many applications.

Constructing an automatic architecture with different network topologies was first explored in [1]. The pioneering frameworks developed in [2] and [3] have attracted much attention and brought many exciting ideas to NAS with high-performance outputs. Unfortunately, most NAS approaches require many GPU days and a large amount of memory, which severely hinders their practicality. Hence, advanced approaches that ensure low memory, computing-resource, and power requirements for neural architecture search are necessary. Apart from satisfying the low memory and computing-resource requirements of the search process, NAS approaches must also offer features such as scalability, efficiency, reliability, and flexibility.
Candidate architecture search in NAS can be performed by reinforcement learning (RL), evolutionary algorithm (EA), gradient-based (GB), or random search (RS) approaches. At present, the gradient-based NAS approach [4]–[8] is considered one of the best candidates among architecture search strategies. Gradient descent can search for better architectures at a local (or, preferably, global) minimum while satisfying the requirements of low memory and computational load. It is often adopted in back-propagation to repeatedly update the network parameters.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

VOLUME 9, 2021
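The gradient descent update underlying these approaches can be illustrated with a minimal sketch. The quadratic loss, learning rate, and step count below are illustrative assumptions for exposition only, not values prescribed by any particular NAS method:

```python
# Minimal gradient-descent sketch: minimize the toy loss
# L(theta) = (theta - 3)^2, whose minimizer is theta = 3.
# (The loss, eta, and step count are illustrative choices.)

def grad(theta):
    # Analytic gradient of L(theta) = (theta - 3)^2
    return 2.0 * (theta - 3.0)

def gradient_descent(theta0, eta=0.1, steps=100):
    """Repeatedly apply the update rule theta <- theta - eta * dL/dtheta."""
    theta = theta0
    for _ in range(steps):
        theta -= eta * grad(theta)
    return theta

theta_star = gradient_descent(theta0=0.0)
print(theta_star)  # converges toward the minimizer theta = 3
```

In back-propagation, the same rule is applied to every network weight, with the gradient computed by the chain rule; differentiable NAS methods additionally apply it to continuous architecture parameters.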