Computers and Electrical Engineering 73 (2019) 1–8
Journal homepage: www.elsevier.com/locate/compeleceng

Pulsar candidate recognition with deep learning

Haoyuan Zhang a,b,*, Zhen Zhao a, Tao An a,c, Baoqiang Lao a, Xiao Chen a

a Shanghai Astronomical Observatory, Chinese Academy of Sciences, 200030 Shanghai, China
b University of Chinese Academy of Sciences, 100049 Beijing, China
c Key Laboratory of Radio Astronomy, Chinese Academy of Sciences, 210008 Nanjing, China

Article info

Article history: Received 29 August 2018; Revised 25 October 2018; Accepted 25 October 2018

Keywords: Pulsar candidate classification; Radio astronomy; Machine learning; Methods and techniques; Convolutional neural network; Square kilometer array

Abstract

In this paper, we present a deep learning-based recognition algorithm to identify pulsars in observational data containing millions of candidates, including radio frequency interference and noise sources. The dataset is obtained from the High Time Resolution Universe survey carried out with the Parkes telescope. We investigate several effective single and combined features via simple logistic regression. To deal with the imbalanced dataset, we oversample the original dataset at different sampling rates; the sampling rate is also treated as one of the learning parameters. After training a convolutional neural network on the pre-processed dataset, we provide a cross-validated evaluation of all candidates. Results show that the deep learning-based recognition algorithm can identify pulsar and radio frequency interference signals with high accuracy. The precision and recall for radio frequency interference are both 100%, and those for pulsars are 91% and 94%, respectively.

© 2018 Elsevier Ltd. All rights reserved.

1. Introduction

Astrophysicists typically need large amounts of pulsar data to establish the statistically significant relationships used to find pulsars. Pulsar candidate selection matters because it is a key step in discovering new pulsars. Recently, machine learning methods have been widely used for pulsar candidate selection problems [1–5]. However, with the advent of the Square Kilometer Array (SKA) radio telescope, the data volume has become extremely large. On the one hand, large-volume data provides a great opportunity to find more pulsars; on the other hand, processing such big data sets quickly becomes a daunting task. The main reason is that traditional machine learning methods cannot meet the SKA data challenge.

Traditional machine learning methods find patterns in features extracted from the data [6,7]. This pattern recognition step does not work effectively for pulsar data. Unlike traditional machine learning methods, deep learning methods learn directly from the data. The development of accelerator hardware, e.g., graphics processing units (GPUs), has significantly expanded the capacity of deep learning methods to deal with big data. Hinton applied deep neural networks (DNNs) to classification problems and obtained highly accurate results [8]. In addition to accuracy, processing speed is also an important factor to consider. To increase the training speed, we adopt convolutional neural networks (CNNs) for pulsar identification; CNNs have fewer parameters than DNNs and are thus faster.

In this work, we use a data architecture that applies learning methods directly to the raw data in order to reduce the system error and obtain highly accurate results. Additionally, by combining the L2 regularization step with a dropout step, we ensure that our model

* Corresponding author at: Shanghai Astronomical Observatory, Chinese Academy of Sciences, 200030 Shanghai, China.
E-mail address: zhy@shao.ac.cn (H. Zhang).
https://doi.org/10.1016/j.compeleceng.2018.10.016
0045-7906/© 2018 Elsevier Ltd. All rights reserved.