Deep-STORM: super-resolution single-molecule microscopy by deep learning

Elias Nehme 1,2,*, Lucien E. Weiss 2, Tomer Michaeli 1, and Yoav Shechtman 2

1 Department of Electrical Engineering, Technion, 32000 Haifa, Israel
2 Department of Biomedical Engineering, Technion, 32000 Haifa, Israel
* Corresponding author: seliasne@campus.technion.ac.il

Abstract

We present an ultra-fast, precise, parameter-free method, which we term Deep-STORM, for obtaining super-resolution images from stochastically blinking emitters, such as the fluorescent molecules used for localization microscopy. Deep-STORM uses a deep convolutional neural network that can be trained on simulated data or experimental measurements, both of which are demonstrated. The method achieves state-of-the-art resolution under challenging signal-to-noise conditions and high emitter densities, and is significantly faster than existing approaches. Additionally, no prior information on the shape of the underlying structure is required, making the method applicable to any blinking dataset. We validate our approach by super-resolution image reconstruction of simulated and experimentally obtained data.

1 Introduction

In conventional microscopy, the spatial resolution of an image is bounded by Abbe's diffraction limit, corresponding to approximately half the optical wavelength. Super-resolution methods, e.g. stimulated emission depletion (STED) [1, 2], structured illumination microscopy (SIM) [3-5], and localization microscopy, namely photo-activated localization microscopy ((F)PALM) [6, 7] and stochastic optical reconstruction microscopy (STORM) [8], have revolutionized biological imaging in the last decade, enabling the observation of cellular structures at the nanoscale [9]. Localization microscopy relies on acquiring a sequence of diffraction-limited images, each containing point-spread functions (PSFs) produced by a sparse set of emitting fluorophores. Next, the emitters are localized with high precision.
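For concreteness, Abbe's limit can be evaluated numerically; the wavelength and numerical aperture below are illustrative values chosen for this sketch, not parameters taken from the experiments in this work.

```python
# Abbe diffraction limit: d = wavelength / (2 * NA).
# Illustrative values (assumed): green emission and a high-NA oil objective.
wavelength_nm = 550.0      # emission wavelength in nm (assumed)
numerical_aperture = 1.4   # oil-immersion objective NA (assumed)

d_nm = wavelength_nm / (2 * numerical_aperture)
print(f"Abbe limit: {d_nm:.0f} nm")  # roughly half the optical wavelength
```

With these values the limit is about 196 nm, consistent with the "half the optical wavelength" rule of thumb; localization microscopy pushes an order of magnitude below this.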
By combining all of the recovered emitter positions from each frame, a super-resolved image is produced, with resolution typically an order of magnitude better than the diffraction limit (down to tens of nanometers).

In localization microscopy, regions with a high density of overlapping emitters pose an algorithmic challenge. This emitter-sparsity constraint leads to a long acquisition time (seconds to minutes), which limits the ability to capture fast dynamics of sub-wavelength processes within live cells. Various algorithms have been developed to handle overlapping PSFs. Existing classes of algorithms are based on sequential fitting of emitters, followed by subtraction of the model PSF [10-13]; blinking statistics [14-16]; sparsity [17-23]; multi-emitter maximum-likelihood estimation [24]; or even single-image super-resolution by dictionary learning [25, 26]. While successful localization of densely spaced emitters has been demonstrated, all existing methods suffer from two fundamental drawbacks: data-processing time and sample-dependent parameter tuning. Even accelerated sparse-recovery methods such as CEL0 [21], which employs the fast FISTA algorithm [27], still involve a time-consuming iterative procedure, and scale poorly with the size of the recovered grid. In addition, current methods rely on parameters that balance different tradeoffs in the recovery process. These need to be tuned carefully through trial and error to obtain satisfactory results, requiring user expertise and tweaking time.

Here we demonstrate precise, fast, parameter-free, super-resolution image reconstruction by harnessing deep learning. Convolutional neural networks have shown impressive results in a variety of image-processing and computer-vision tasks, such as single-image resolution enhancement [28-32] and segmentation [33-35]. In this work, we employ a fully convolutional neural network for super-resolution image reconstruction from dense fields of overlapping emitters.
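To make the per-iteration cost of the sparse-recovery baselines concrete, a minimal FISTA loop for the generic l1-regularized problem min_x 0.5*||Ax - b||^2 + lam*||x||_1 can be sketched as follows. This is a textbook FISTA sketch on a toy problem, not the CEL0 formulation of [21]; the operator `A`, the regularization weight `lam`, and the iteration count are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista(A, b, lam, n_iter=200):
    """FISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1.
    Each iteration costs one multiply by A and one by A^T; the many
    iterations needed are the runtime bottleneck noted in the text."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    y, t = x.copy(), 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)
        x_new = soft_threshold(y - grad / L, lam / L)
        t_new = (1 + np.sqrt(1 + 4 * t ** 2)) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x)  # momentum extrapolation
        x, t = x_new, t_new
    return x

# Toy problem: recover a 2-sparse signal from 40 random measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[[10, 70]] = [1.0, -0.5]
x_hat = fista(A, A @ x_true, lam=0.05)
```

Note that the cost of every iteration grows with the number of grid points in `x`; on a finely sampled super-resolution grid this is exactly the poor scaling described above, and it is what a single feed-forward pass through a trained network avoids.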
Our method, dubbed Deep-STORM, does not explicitly localize emitters. Instead, it creates a super-resolved image from the raw data directly. The network produces images with a reconstruction resolution comparable to or better than existing methods; furthermore, the method is extremely fast, and our software can leverage GPU computation for further enhanced speed. Moreover, Deep-STORM is parameter-free, requiring no expertise from the user, and is easily implemented for any single-molecule dataset. Importantly, Deep-STORM is general and does not rely on any prior knowledge of the structure in the sample, unlike recently demonstrated single-shot image enhancement by deep learning [36].

2 Methods

2.1 Deep Learning

In short, Deep-STORM utilizes an artificial neural network that receives a set of frames of (possibly very dense) point emitters and outputs a set of super-resolved images (one per frame), based on prior training performed on simulated or experimentally obtained images with known emitter positions. The output images are then summed to produce a single super-resolved image.

2.1.1 Architecture

The network architecture is based on a fully convolutional encoder-decoder network and was inspired by previous work on cell counting [37]. The network (Figure 1) first encodes the input intensity image into a dense, aggregated feature representation, through three 3 × 3 convolutional layers with increasing depth, interleaved with 2 × 2 max-pooling layers (Supplementary Information). The result is an encoded representation of the data. Afterwards, in the decoding stage, the spatial dimensions are restored to the size of the input image through three successive

arXiv:1801.09631v3 [physics.optics] 2 May 2018
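The shape bookkeeping of the encoder can be sketched with plain NumPy: 3 × 3 "same"-padded convolutions preserve height and width, so only the three 2 × 2 pooling stages change the spatial size, while the channel depth grows. The input size and the depth sequence (16, 32, 64) below are illustrative assumptions, not the exact layer widths from the Supplementary Information, and the random arrays merely stand in for convolution outputs.

```python
import numpy as np

def maxpool2x2(x):
    """2x2 max-pooling with stride 2 on an (H, W, C) array."""
    h, w, c = x.shape
    return x[:h - h % 2, :w - w % 2, :].reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

x = np.random.rand(128, 128, 1)  # input intensity image (size assumed)
for depth in (16, 32, 64):       # increasing feature depth (illustrative)
    # Stand-in for a 3x3 'same'-padded conv: spatial size unchanged, depth grows.
    x = np.random.rand(*x.shape[:2], depth)
    x = maxpool2x2(x)            # each pooling stage halves H and W
    print(x.shape)               # (64, 64, 16) -> (32, 32, 32) -> (16, 16, 64)
```

After the three encoding stages the spatial dimensions have shrunk by 8x, which is what the decoding stage described next must undo to restore the input size.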