IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 60, 2022 5509612
Hyperspectral Image Classification Using
Attention-Based Bidirectional Long
Short-Term Memory Network
Shaohui Mei , Senior Member, IEEE, Xingang Li, Xiao Liu, Huimin Cai, and Qian Du , Fellow, IEEE
Abstract—Deep neural networks have been widely applied to
hyperspectral image (HSI) classification, among which the
recurrent neural network (RNN) is one of the most typical. Most
of the existing RNN-based classifiers treat the spectral signature
of pixels as an ordered sequence, in which only unidirectional
correlation along the wavelength direction of adjacent bands is
considered. However, each band image is related to not only
its preceding band images but also its successive band images.
In order to fully explore such bidirectional spectral correlation
within an HSI, in this article, a bidirectional long short-term
memory (Bi-LSTM)-based network is designed for HSI clas-
sification. Moreover, a spatial–spectral attention mechanism is
designed and implemented in the proposed Bi-LSTM network to
emphasize the effective information and reduce the redundant
information among spatial–spectral context of pixels, by which
the performance of classification can be greatly improved.
Experimental results over three benchmark HSIs, i.e., Salinas
Valley, Pavia Centre, and Pavia University, demonstrate that our
proposed Bi-LSTM significantly outperforms several state-of-the-art
unidirectional RNN-based classification algorithms. Moreover,
the proposed spatial–spectral attention mechanism can further
improve the classification accuracy of our proposed Bi-LSTM
algorithm by effectively weighting spatial and spectral context
of pixels. The source code of the proposed Bi-LSTM algorithm
is available at https://github.com/MeiShaohui/Attention-based-Bidirectional-LSTM-Network.
Index Terms— Attention network, classification, deep learning,
hyperspectral image (HSI), recurrent neural network (RNN).
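To make the idea from the abstract concrete, the following is a minimal numpy sketch, not the authors' implementation: a toy LSTM cell is run over a pixel's band sequence in both wavelength directions, and the per-band hidden states are pooled with softmax attention weights. All sizes and the random parameters are hypothetical.

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gates stacked as [input, forget, cell, output]."""
    z = W @ x + U @ h + b
    H = h.shape[0]
    i = 1 / (1 + np.exp(-z[:H]))        # input gate
    f = 1 / (1 + np.exp(-z[H:2*H]))     # forget gate
    g = np.tanh(z[2*H:3*H])             # candidate cell state
    o = 1 / (1 + np.exp(-z[3*H:]))      # output gate
    c = f * c + i * g
    return o * np.tanh(c), c

def bi_lstm_attention(spectrum, params, hidden=8):
    """Run an LSTM over the band sequence in both directions,
    then pool the per-band hidden states with softmax attention."""
    Wf, Uf, bf, Wb, Ub, bb, v = params
    T = len(spectrum)
    h_f = np.zeros(hidden); c_f = np.zeros(hidden)
    h_b = np.zeros(hidden); c_b = np.zeros(hidden)
    fwd, bwd = [], []
    for t in range(T):                  # forward pass: wavelength order
        h_f, c_f = lstm_step(spectrum[t:t+1], h_f, c_f, Wf, Uf, bf)
        fwd.append(h_f)
    for t in reversed(range(T)):        # backward pass: reverse order
        h_b, c_b = lstm_step(spectrum[t:t+1], h_b, c_b, Wb, Ub, bb)
        bwd.append(h_b)
    H = np.concatenate([np.stack(fwd), np.stack(bwd[::-1])], axis=1)  # (T, 2*hidden)
    scores = H @ v                      # one attention score per band
    a = np.exp(scores - scores.max()); a /= a.sum()
    return a @ H                        # attention-pooled feature, shape (2*hidden,)

rng = np.random.default_rng(0)
hidden, bands = 8, 20                   # hypothetical sizes
def mk():
    return (rng.standard_normal((4*hidden, 1)) * 0.1,
            rng.standard_normal((4*hidden, hidden)) * 0.1,
            np.zeros(4*hidden))
Wf, Uf, bf = mk(); Wb, Ub, bb = mk()
v = rng.standard_normal(2*hidden) * 0.1
feat = bi_lstm_attention(rng.standard_normal(bands), (Wf, Uf, bf, Wb, Ub, bb, v))
print(feat.shape)  # (16,)
```

The pooled feature would then feed a fully connected classifier; in the paper's network, the attention weights are learned jointly with the LSTM parameters rather than fixed as here.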
I. INTRODUCTION

HYPERSPECTRAL imaging sensors can obtain abundant
spectral information of objects while preserving their
spatial information, which enables the exploration of both
spectral and spatial characteristics. Compared with conventional
color images or multispectral remote sensing images, hyperspectral
images (HSIs) are much richer in information,
leading to great interest in many application fields, such as
medicine, agriculture, industry, and food [1]–[4].
Manuscript received February 7, 2021; revised June 24, 2021 and July 16,
2021; accepted July 27, 2021. Date of publication August 11, 2021; date
of current version January 17, 2022. This work was supported in part by
the Fundamental Research Funds for the Central Universities and in part by
the National Natural Science Foundation of China under Grant 61671383.
(Corresponding author: Shaohui Mei.)
Shaohui Mei, Xingang Li, and Xiao Liu are with the School of Electronics
and Information, Northwestern Polytechnical University, Xi’an, Shaanxi
710129, China (e-mail: meish@nwpu.edu.cn).
Huimin Cai is with Tianjin Jinhang Institute of Technical Physics,
Tianjin 300192, China.
Qian Du is with the Department of Electrical and Computer Engineering,
Mississippi State University, Starkville, MS 39762 USA.
Digital Object Identifier 10.1109/TGRS.2021.3102034
HSI classification, which assigns labels to different pixels
by exploring their spectral signature and spatial context, has
attracted great attention in the past decades. A straightforward
approach is to directly feed spectral pixel vectors into conven-
tional classifiers [5]. For example, Melgani and Bruzzone [6]
and Camps-Valls et al. [7] addressed the problem of the clas-
sification of HSIs by support vector machines (SVMs). Ham
et al. [8] and Belgiu and Drăguț [9] proposed to classify HSIs
using a random forest (RF) classifier. However, due to high
spectral dimensionality, directly using the spectral information
of HSIs can easily lead to the curse of dimensionality (i.e.,
the Hughes effect) [10], [11]. Therefore, many methods were
proposed to explore discriminative features implied in the
high-dimensional spectral signatures [12], [13], among which
the representative algorithms are principal component analysis
(PCA) [14]–[16], linear discriminant analysis (LDA) [17],
manifold learning-based methods [18], [19], and graph embed-
ding [20], [21].
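As a minimal sketch of the PCA step named above (numpy only; the pixel count, band count, and number of retained components are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
pixels = rng.standard_normal((1000, 200))   # 1000 pixels, 200 spectral bands (hypothetical)

# Center the spectral vectors, then project onto the top principal axes.
mean = pixels.mean(axis=0)
X = pixels - mean
# SVD of the centered matrix: rows of Vt are the principal axes.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
k = 10                                      # retained components
reduced = X @ Vt[:k].T                      # (1000, 10) features for a classifier
explained = (S[:k]**2).sum() / (S**2).sum() # fraction of variance retained
print(reduced.shape, round(float(explained), 3))
```

The reduced features can then be passed to any of the conventional classifiers mentioned above, mitigating the Hughes effect caused by the raw spectral dimensionality.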
In recent years, many attempts for HSI classification
have been made with deep learning [22], [23], in which
convolutional neural network (CNN) has achieved many suc-
cesses [24], [25]. Generally, CNN can conduct feature extrac-
tion over HSIs in different dimensions for classification tasks.
For example, the 1-D CNN (1DCNN) directly feeds
1-D spectral vectors into the network for classification [26],
[27], by which the relationships between the spectral signatures
associated with each HSI pixel and the information contained
in them are learned [28], [29]. In order to learn spatial feature
representations from the data, the 2-D-CNN is used to handle
hyperspectral data that have been dimensionality-reduced by
PCA for classification [30]–[32]. In order to fully explore the
high-dimensional data structure of HSIs, 3-D convolution is
directly used in many CNNs to explore the spatial–spectral
property of HSIs for classification [33], [34], such as the
multiscale 3-D deep CNN (M3D-CNN) [35] and HSI-CNN [36].
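The three input arrangements just described can be illustrated at the level of array shapes alone (not the cited networks themselves; the cube size, patch radius, and reduced depth below are hypothetical, and the first k bands stand in for a real PCA projection):

```python
import numpy as np

H, W, B = 64, 64, 103          # hypothetical HSI cube: height, width, bands
cube = np.random.default_rng(1).standard_normal((H, W, B))
r, row, col = 3, 32, 32        # patch radius and a center pixel

# 1-D-CNN input: the pixel's spectral vector alone.
x1d = cube[row, col]                                   # (103,)

# 2-D-CNN input: a spatial patch of dimensionality-reduced data
# (first k bands here as a stand-in for PCA components).
k = 4
patch2d = cube[row-r:row+r+1, col-r:col+r+1, :k]       # (7, 7, 4)

# 3-D-CNN input: the full spatial-spectral patch around the pixel.
patch3d = cube[row-r:row+r+1, col-r:col+r+1, :]        # (7, 7, 103)

print(x1d.shape, patch2d.shape, patch3d.shape)
```

The choice among the three trades off spectral detail, spatial context, and parameter count, which is why the 3-D variants cited above target the joint spatial-spectral structure directly.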
Lightweight CNN-based HSI classification has
also been explored in [37] and [38]. Autoencoders (AEs)
have also been used as deep models to perform unsupervised
coding of HSI data. For example, an unsupervised
tied AE (TAE) was proposed for spectral feature extrac-
tion [39]. Spectral–spatial feature extraction has also been
implemented using AE-based networks, such as stacked AE
1558-0644 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.