SIViP
DOI 10.1007/s11760-017-1155-y

ORIGINAL PAPER

Block-based image fusion using multi-scale analysis to enhance depth of field and dynamic range

Vishal Chaudhary (1) · Vinay Kumar (1)

Received: 12 February 2017 / Revised: 21 June 2017 / Accepted: 26 July 2017
© Springer-Verlag London Ltd. 2017

Electronic supplementary material The online version of this article (doi:10.1007/s11760-017-1155-y) contains supplementary material, which is available to authorized users.

Corresponding author: Vishal Chaudhary, vishalch.nsit@gmail.com
(1) ECED, Thapar University, Patiala, India

Abstract This manuscript presents a novel technique that integrates information by exploring multi-scale positions with block-based fusion while addressing blocking effects. The source images are split into local and global layers with a neighbor distance filter, which extracts information at multi-scale positions. The recombined local and global layers are constructed using block-based and weighted-average methods, respectively. Spatial frequency and an exposedness factor are used to measure the texture information and exposure level of each block. The resulting local and global layers are then fused to generate the final fused image. The method is applicable to any number of source images. Extensive experimental results demonstrate the effectiveness of the proposed technique.

Keywords Depth of field (DoF) · Image fusion · Multi-focus image · Multi-exposure image · Shutter speed

1 Introduction

Image fusion [1] is the process of collecting complementary information from a number of captured images (source images) of the same scene and combining it into a single image; refer to Fig. 1. Image fusion has a wide range of applications, from defense to medical systems [2-6]. Image fusion can be categorized in a number of ways based on the image-capturing conditions: multi-focus, multi-exposure, multi-view and multi-modal image fusion.
The present manuscript utilizes multi-focus and multi-exposure methods to combine the available information into a single image. Varying the shutter speed captures diverse details of the scene as source images [7]. To obtain an image with optimum detail, complementary information is combined from the captured images; this process is known as multi-exposure image fusion. Figure 1 shows the images with different exposure times and the corresponding fused image.

The lens is another device feature that directs information from the scene to the sensor. It keeps some sections of the scene sharp while blurring the remainder; the extent of the sharp region is termed the depth of field (DoF) [8,9]. Figure 2 shows images with limited DoF and the fused image. To improve the DoF, source images with diversely focused regions are combined; this process is known as multi-focus image fusion.

The techniques available in the literature for multi-focus image fusion are broadly classified into spatial-domain and transform-domain techniques. In the spatial domain, operations are performed directly on pixels. The simplest approach is to average the intensities of the source images pixel by pixel. De et al. [8] proposed a method using mathematical morphology on individual pixels. Bai et al. [10] used a morphology-based top-hat transform to perform fusion on a pixel-by-pixel basis. These methods are prone to contrast reduction and are sensitive to misregistration. To overcome these problems, researchers have adopted block-based fusion, where the blocks with the highest information measure among the source images are combined to construct the fused image. Li et al. [11] proposed block-based fusion based on spatial frequency components. Li et al. [12] trained neural networks to select the best block using spatial frequency, visibility and edge features. Miao and Wang [13] and Huang and Jing [14] used pulse coupled neural networks,
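The spatial-frequency block-selection idea referenced above (as in [11]) can be illustrated with a minimal NumPy sketch. The block size, the tie-breaking rule, and the assumption that the image dimensions are multiples of the block size are all illustrative choices, not details taken from the paper:

```python
import numpy as np

def spatial_frequency(block):
    """SF = sqrt(RF^2 + CF^2), where RF and CF are the RMS of the
    horizontal and vertical first differences of the block."""
    b = block.astype(np.float64)
    rf = np.sqrt(np.mean(np.diff(b, axis=1) ** 2))  # row frequency
    cf = np.sqrt(np.mean(np.diff(b, axis=0) ** 2))  # column frequency
    return np.sqrt(rf ** 2 + cf ** 2)

def block_fuse(img_a, img_b, bs=8):
    """Tile both grayscale images into bs x bs blocks and copy, for each
    tile, the block with the higher spatial frequency into the output.
    Assumes both images have the same shape, divisible by bs."""
    h, w = img_a.shape
    fused = np.empty_like(img_a)
    for i in range(0, h, bs):
        for j in range(0, w, bs):
            a = img_a[i:i + bs, j:j + bs]
            b = img_b[i:i + bs, j:j + bs]
            pick = a if spatial_frequency(a) >= spatial_frequency(b) else b
            fused[i:i + bs, j:j + bs] = pick
    return fused
```

A flat (defocused) block has zero spatial frequency, so each output tile comes from whichever source image is textured there; hard per-block selection like this is also what produces the blocking effects that the proposed layer decomposition is designed to mitigate.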