Computers and Electrical Engineering 73 (2019) 1–8
Contents lists available at ScienceDirect
Computers and Electrical Engineering
journal homepage: www.elsevier.com/locate/compeleceng
Pulsar candidate recognition with deep learning
Haoyuan Zhang a,b,∗, Zhen Zhao a, Tao An a,c, Baoqiang Lao a, Xiao Chen a
a Shanghai Astronomical Observatory, Chinese Academy of Sciences, 200030 Shanghai, China
b University of Chinese Academy of Sciences, 100049 Beijing, China
c Key Laboratory of Radio Astronomy, Chinese Academy of Sciences, 210008 Nanjing, China
Article info
Article history:
Received 29 August 2018
Revised 25 October 2018
Accepted 25 October 2018
Keywords:
Pulsar candidate classification
Radio astronomy
Machine learning
Methods and techniques
Convolutional neural network
Square kilometer array
Abstract
In this paper, we present a deep learning-based recognition algorithm that identifies pulsars in observational data containing millions of candidates, including radio frequency interference and noise sources. The dataset is obtained from the High Time Resolution Universe survey, created and updated using the Parkes telescope. We investigate several effective single and combined features via simple logistic regression. To deal with the imbalanced dataset, we oversample the original dataset at different sampling rates; the sampling rate is also treated as one of the learning parameters. After training a convolutional neural network on the pre-processed dataset, we provide a cross-validated evaluation of all candidates. Results show that the deep learning-based recognition algorithm identifies pulsar and radio frequency interference signals with high accuracy. The precision and recall for radio frequency interference are both 100%, and those for pulsars are 91% and 94%, respectively.
© 2018 Elsevier Ltd. All rights reserved.
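As an illustration of the two quantities the abstract relies on, the following minimal sketch rebalances a toy dataset by resampling the minority class at a chosen rate and then computes precision and recall for one class. The resampling scheme (random duplication), the function names, the `rate` convention and the toy labels are all assumptions made for this example; the abstract does not specify them, and this is not the implementation used in the paper.

```python
import random
from collections import Counter


def oversample_minority(samples, labels, minority_label, rate, seed=0):
    """Append randomly duplicated minority-class samples (hypothetical helper).

    `rate` is the number of extra copies drawn per original minority sample,
    so rate=0 leaves the data unchanged. This convention is an assumption.
    """
    rng = random.Random(seed)
    minority = [s for s, y in zip(samples, labels) if y == minority_label]
    n_extra = int(round(rate * len(minority)))
    extra = [rng.choice(minority) for _ in range(n_extra)] if minority else []
    new_samples = list(samples) + extra
    new_labels = list(labels) + [minority_label] * len(extra)
    return new_samples, new_labels


def precision_recall(y_true, y_pred, positive):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (made up): 2 "pulsar" candidates among 8 "rfi" candidates.
samples = list(range(10))
labels = ["pulsar", "pulsar"] + ["rfi"] * 8
new_samples, new_labels = oversample_minority(samples, labels, "pulsar", rate=3)
print(Counter(new_labels))

# Toy predictions (made up) to exercise the metric definitions.
y_true = ["pulsar", "pulsar", "rfi", "rfi"]
y_pred = ["pulsar", "rfi", "rfi", "pulsar"]
print(precision_recall(y_true, y_pred, "pulsar"))
```

With `rate=3` the two toy pulsar samples gain six duplicates, so both classes end up with eight entries. In practice, oversampling would be applied only to the training folds so that duplicated candidates do not leak into the evaluation set.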
1. Introduction
Astrophysicists typically require large amounts of pulsar data to establish the statistically significant relationships needed to find pulsars. Pulsar candidate selection matters because it is a key step in discovering new pulsars.
Recently, machine learning methods have been widely used for pulsar candidate selection problems [1–5]. However, with the advent of the Square Kilometer Array (SKA) radio telescope, the data volume has become extremely high. On the one hand, large-volume data provides a great opportunity to find more pulsars; on the other hand, processing such large data sets quickly becomes a daunting task, and traditional machine learning methods cannot meet the SKA data challenge. Traditional machine learning methods find patterns in features extracted from the data [6,7], and this feature-based pattern recognition step does not work effectively for pulsar data. Deep learning methods, by contrast, learn directly from the data. The development of accelerator hardware, e.g., graphics processing units (GPUs), significantly expands the capacity of deep learning methods to deal with big data. Hinton applied deep neural networks (DNNs) to classification problems and obtained highly accurate results [8]. In addition to accuracy, processing speed is an important factor to consider. To increase the training speed, we adopt convolutional neural networks (CNNs) for pulsar identification, which have fewer parameters and thus train faster than DNNs. In this work, we design the data architecture so that learning methods are applied directly to the raw data, which reduces the system error and yields highly accurate results. Additionally, by combining the L2 regularization step with a dropout step, we ensure that our model
∗ Corresponding author at: Shanghai Astronomical Observatory, Chinese Academy of Sciences, 200030 Shanghai, China.
E-mail address: zhy@shao.ac.cn (H. Zhang).
https://doi.org/10.1016/j.compeleceng.2018.10.016
0045-7906/© 2018 Elsevier Ltd. All rights reserved.