This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING

Deep Learning for Predicting Significant Wave Height From Synthetic Aperture Radar

Brandon Quach, Yannik Glaser, Justin Edward Stopa, Alexis Aurélien Mouche, and Peter Sadowski

Abstract—The Sentinel-1 satellites equipped with synthetic aperture radars (SARs) provide near-global coverage of the world’s oceans every six days. We curate a data set of collocations between SAR and altimeter satellites and investigate the use of deep learning to predict significant wave height from SAR. While previous models for predicting geophysical quantities from SAR rely heavily on feature engineering, our approach learns directly from low-level image cross-spectra. Training on collocations from 2015 to 2017, we demonstrate on test data from 2018 that deep learning reduces the state-of-the-art root mean squared error by 50%, from 0.6 to 0.3 m when compared to altimeter data. Furthermore, we isolate the contributions of different features to the model performance.

Index Terms—CWAVE, deep learning, machine learning, neural networks, Sentinel-1, significant wave height, synthetic aperture radar (SAR).

I. INTRODUCTION

Synthetic aperture radar (SAR) enables us to measure submesoscale phenomena with unprecedented coverage, resolution, and frequency. By measuring the backscatter from the ocean surface, SAR captures information about ocean swells and sea surface roughness at high spatial resolutions (<10 m) [1], from which many oceanic, atmospheric, and biologic phenomena can be identified [2]. The two Sentinel-1 satellites of the European Space Agency (ESA) take regular SAR measurements of the ocean surface, together covering the entire globe every six days [3], and have already accumulated more than 600 TB of level-1 (L1) wave mode data.
However, in order to take full advantage of this technology and the torrent of data being produced, new methods are needed to extract useful information from the high-dimensional measurements.

Manuscript received February 14, 2020; revised May 22, 2020; accepted June 8, 2020. (Corresponding author: Justin Edward Stopa.)
Brandon Quach is with the Computing and Mathematical Sciences Department, California Institute of Technology, Pasadena, CA 91125-0002 USA, and also with the Information and Computer Sciences Department, University of Hawai’i at Mānoa, Honolulu, HI 96822 USA.
Yannik Glaser and Peter Sadowski are with the Information and Computer Sciences Department, University of Hawai’i at Mānoa, Honolulu, HI 96822 USA.
Justin Edward Stopa is with the Ocean Engineering Department, University of Hawai’i at Mānoa, Honolulu, HI 96822 USA (e-mail: stopa@hawaii.edu).
Alexis Aurélien Mouche is with the Univ. Brest, CNRS, IRD, IFREMER, Laboratoire d’Océanographie Physique et Spatiale (LOPS), IUEM, 29280 Brest, France.
Color versions of one or more of the figures in this article are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TGRS.2020.3003839

Sea state information extracted from SAR has been instrumental in understanding swell decay [1], [4], [5], improving swell propagation in numerical models [6], and predicting swell amplitudes and arrival times by assimilation into numerical models [7]. SAR can also be used to estimate extreme sea states in extra-tropical and tropical cyclones [8]–[10]. A geophysical quantity of particular interest is the significant wave height, H_s, defined as the mean of the top third of a wave height distribution, and estimating H_s from SAR has immediate practical uses in alerting ships to dangerously large waves.
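To make the definition of significant wave height concrete, the following is a minimal sketch that computes the mean of the highest third of a list of individual wave heights. It is not the estimation method of this article, which predicts the quantity from SAR spectra. The function name `significant_wave_height`, the sample values, and the choice to round the size of the top third down (keeping at least one wave) are assumptions made for illustration only.

```python
def significant_wave_height(wave_heights):
    """Return the mean of the highest third of the given wave heights.

    The result has the same units as the input. The size of the "top third"
    is rounded down, with a minimum of one wave (an illustrative assumption).
    """
    if not wave_heights:
        raise ValueError("need at least one wave height")
    ordered = sorted(wave_heights, reverse=True)  # largest waves first
    n_top = max(1, len(ordered) // 3)             # number of waves in the top third
    return sum(ordered[:n_top]) / n_top


# Hypothetical individual wave heights in metres (illustrative values only).
heights_m = [0.4, 0.7, 0.9, 1.1, 1.2, 1.5, 1.8, 2.1, 2.6]
hs = significant_wave_height(heights_m)
print(f"Significant wave height = {hs:.2f} m")  # prints "Significant wave height = 2.17 m"
```

With nine waves, the top third consists of the three largest (2.6, 2.1, and 1.8 m), so the result is their average.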
Traditional “inverse” algorithms for inferring H_s from SAR are slow and perform poorly in windy conditions typical of most storms [11], [12] because of the complex nonlinear mechanism involved in the image synthesis when observing moving scenes. As a result, several recent studies have focused on data-driven statistical models [8]–[10], [13].

Previous data-driven approaches for predicting H_s from SAR used small data sets of buoy observations as targets for training (<5000 examples) [14]–[16], or numerical models of global wave generation such as WAVEWATCH3 [8], [10], [13], [17]. The current state-of-the-art method uses a neural network trained on the latter, and predicts H_s with 0.6-m root mean squared error (RMSE) [10]. However, the WAVEWATCH3 targets are only an estimate of H_s and are known to be unreliable in high sea states [18]–[20]. Furthermore, the neural network in [10] relies on a reduced representation of the modulation cross-spectra: a set of 22 engineered features known as CWAVE [13]. Such dimensionality-reduction methods can be very useful, but often come at the cost of discarding relevant information.

We hypothesize that the SAR image modulation spectra contain additional information about H_s that is lost by the CWAVE dimensionality-reduction step. We propose to learn the relevant intermediate data representations using deep learning with artificial neural networks, similar to what has been done in other fields from computer vision [21] to high-energy physics [22]–[24].

In this work, we address both limitations of current data-driven H_s prediction models. First, we curate a data set containing direct observations of ocean wave heights by identifying 750,000 collocations of SAR and altimeter satellites. Second, we train a statistical model to extract information directly from low-level SAR image spectra using deep learning.
Finally, we analyze the importance of the different inputs to this model, and its performance in different settings.

0196-2892 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

II. DATA AND METHODS

A. Sensors, Collocations and Preprocessing

Our first contribution is a data set of historical measurements from two types of polar-orbiting satellites: Sentinel-1 SAR satellites and altimeter satellites. Because the satellites are in different orbits, their paths intersect, providing