IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 60, 2022, 5509612

Hyperspectral Image Classification Using Attention-Based Bidirectional Long Short-Term Memory Network

Shaohui Mei, Senior Member, IEEE, Xingang Li, Xiao Liu, Huimin Cai, and Qian Du, Fellow, IEEE

Abstract—Deep neural networks have been widely applied to hyperspectral image (HSI) classification, among which the recurrent neural network (RNN) is one of the most typical architectures. Most existing RNN-based classifiers treat the spectral signature of a pixel as an ordered sequence, in which only unidirectional correlation between adjacent bands along the wavelength direction is considered. However, each band image is related not only to its preceding band images but also to its successive band images. In order to fully explore such bidirectional spectral correlation within an HSI, a bidirectional long short-term memory (Bi-LSTM)-based network is designed in this article for HSI classification. Moreover, a spatial–spectral attention mechanism is designed and implemented in the proposed Bi-LSTM network to emphasize effective information and suppress redundant information in the spatial–spectral context of pixels, by which the classification performance can be greatly improved. Experimental results over three benchmark HSIs, i.e., Salinas Valley, Pavia Centre, and Pavia University, demonstrate that the proposed Bi-LSTM clearly outperforms several state-of-the-art unidirectional RNN-based classification algorithms. Moreover, the proposed spatial–spectral attention mechanism can further improve the classification accuracy of the proposed Bi-LSTM algorithm by effectively weighting the spatial and spectral context of pixels. The source code of the proposed Bi-LSTM algorithm is available at https://github.com/MeiShaohui/Attention-based-Bidirectional-LSTM-Network.

Index Terms—Attention network, classification, deep learning, hyperspectral image (HSI), recurrent neural network (RNN).

Manuscript received February 7, 2021; revised June 24, 2021 and July 16, 2021; accepted July 27, 2021. Date of publication August 11, 2021; date of current version January 17, 2022. This work was supported in part by the Fundamental Research Funds for the Central Universities and in part by the National Natural Science Foundation of China under Grant 61671383. (Corresponding author: Shaohui Mei.)

Shaohui Mei, Xingang Li, and Xiao Liu are with the School of Electronics and Information, Northwestern Polytechnical University, Xi’an, Shaanxi 710129, China (e-mail: meish@nwpu.edu.cn). Huimin Cai is with Tianjin Jinhang Institute of Technical Physics, Tianjin 300192, China. Qian Du is with the Department of Electrical and Computer Engineering, Mississippi State University, Starkville, MS 39762 USA.

Digital Object Identifier 10.1109/TGRS.2021.3102034

I. INTRODUCTION

HYPERSPECTRAL imaging sensors can obtain abundant spectral information of objects while preserving their spatial information, which makes it possible to explore both spectral and spatial characteristics. Compared with conventional color images or multispectral remote sensing images, hyperspectral images (HSIs) offer greatly improved information richness, leading to great interest in many application fields, such as medicine, agriculture, industry, and food [1]–[4].

HSI classification, which assigns labels to pixels by exploring their spectral signatures and spatial context, has attracted great attention in the past decades. A simple way toward this purpose is to directly feed spectral pixel vectors into conventional classifiers [5]. For example, Melgani and Bruzzone [6] and Camps-Valls et al. [7] addressed the classification of HSIs with support vector machines (SVMs), while Ham et al. [8] and Belgiu and Drăguţ [9] proposed to classify HSIs using a random forest (RF) classifier.
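The pixel-wise pipeline described above, in which each pixel's spectral vector is fed to a conventional classifier, can be sketched as follows. This is a minimal illustration, not the method of [6]–[9]: a nearest-centroid rule stands in for the SVM/RF classifiers, and the function names, cube shape, and labels are assumptions made for the example.

```python
import numpy as np

def classify_pixels(cube, train_mask, train_labels):
    """Pixel-wise HSI classification: each pixel's spectral vector is the
    feature. A nearest-centroid rule stands in for an SVM or RF."""
    h, w, bands = cube.shape
    X = cube.reshape(-1, bands)          # (n_pixels, n_bands) spectral vectors
    Xtr = X[train_mask.ravel()]          # labeled training pixels
    classes = np.unique(train_labels)
    centroids = np.stack([Xtr[train_labels == c].mean(axis=0) for c in classes])
    # Assign every pixel to the class with the nearest spectral centroid.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[d.argmin(axis=1)].reshape(h, w)
```

In practice the training pixels would come from labeled ground truth, and the centroid rule would be replaced by a trained SVM or RF; the point here is only that the spectral vector alone serves as the feature, ignoring spatial context.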
However, due to the high spectral dimensionality, directly using the spectral information of HSIs can easily lead to the curse of dimensionality (i.e., the Hughes effect) [10], [11]. Therefore, many methods have been proposed to extract the discriminative features implied in the high-dimensional spectral signatures [12], [13], among which the representative algorithms are principal component analysis (PCA) [14]–[16], linear discriminant analysis (LDA) [17], manifold learning-based methods [18], [19], and graph embedding [20], [21].

In recent years, many attempts at HSI classification have been made with deep learning [22], [23], in which the convolutional neural network (CNN) has achieved many successes [24], [25]. Generally, a CNN can conduct feature extraction over HSIs in different dimensions for classification tasks. For example, the 1-D CNN (1DCNN) directly feeds 1-D spectral vectors into the network for classification [26], [27], by which the relationships between the spectral signatures associated with each HSI pixel and the information contained in them are learned [28], [29]. In order to learn spatial feature representations from the data, the 2-D CNN is applied to hyperspectral data whose dimensionality has been reduced by PCA [30]–[32]. In order to better exploit the high-dimensional data structure of HSIs, 3-D convolution is directly used in many CNNs to explore the spatial–spectral properties of HSIs for classification [33], [34], such as the multiscale 3-D deep CNN (M3D-CNN) [35] and HSI-CNN [36]. Lightweight versions of CNN-based HSI classification have also been explored in [37] and [38]. Autoencoders (AEs) have also been used as deep models to perform unsupervised coding of HSI data. For example, an unsupervised tied AE (TAE) was proposed for spectral feature extraction [39]. Spectral–spatial feature extraction has also been implemented using AE-based networks, such as the stacked AE.
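As a concrete illustration of the PCA preprocessing commonly applied before 2-D CNN classification, the following NumPy sketch projects each pixel's spectrum onto the leading principal components. The cube shape and component count are illustrative assumptions, not the configuration used in [30]–[32].

```python
import numpy as np

def pca_reduce(cube, n_components=3):
    """Project each pixel's spectral vector onto the top principal
    components, reducing the band dimension while keeping spatial layout."""
    h, w, bands = cube.shape
    X = cube.reshape(-1, bands).astype(float)
    X -= X.mean(axis=0)                  # center the spectral vectors per band
    # Principal directions via SVD of the centered (n_pixels, n_bands) matrix;
    # rows of Vt are ordered by decreasing explained variance.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return (X @ Vt[:n_components].T).reshape(h, w, n_components)
```

Because the reduced cube keeps the spatial layout intact, fixed-size patches around each pixel can then be cropped and fed to a 2-D CNN, which is the usual role of this step.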
1558-0644 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.