RITA RANA CHHIKARA et al: A FEATURE SELECTION TECHNIQUE FOR BLIND IMAGE STEGANALYSIS
DOI 10.5013/IJSSST.a.16.04.02, ISSN: 1473-804x online, 1473-8031 print

A Hybrid Feature Selection Technique based on Improved Discrete Firefly and Filter Approach for Blind Image Steganalysis

Rita Rana Chhikara, Dept of CSE/IT, ITM University, Gurgaon, Haryana, ritachhikara@itmindia.edu
Latika Singh, Dept of CSE/IT, ITM University, Gurgaon, Haryana, latikasingh@itmindia.edu

Abstract— Feature selection is a preprocessing technique of great significance in data mining applications. It aims to reduce computational complexity and increase the predictive capability of a learning system. This paper presents a new hybrid feature selection algorithm that combines a Discrete Firefly optimization technique, using dynamic alpha and gamma parameters, with a t-test filter technique to improve the detectability of hidden messages in blind image steganalysis. The experiments are conducted on a dataset of feature vectors extracted from the frequency domain, namely the Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT) domains, of cover and stego images. Results for the popular JPEG steganography algorithms nsF5, Outguess, PQ and JP Hide and Seek show that the proposed method is able to identify sensitive features and reduce the feature set by 67% in the DCT domain and 37% in the DWT domain. The experimental analysis shows that these algorithms are most sensitive to Markov features from the DCT domain and to the variance statistical moment from the DWT domain. The results are compared with Discrete Particle Swarm Optimization (DPSO) and well-known multivariate feature selection techniques.

Keywords- Discrete Firefly Algorithm, Feature Selection, Steganalysis, t-Test, DCT, DWT

I. INTRODUCTION

Steganalysis is the science of breaking steganography, which is the science of embedding hidden messages in innocent-looking cover documents such as text, image, audio and video files [1]. It forms an important area of digital forensics.
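As a rough illustration of the t-test filter step named in the abstract, the sketch below scores each feature by the absolute two-sample t statistic between cover and stego samples and keeps the highest-scoring ones. It assumes the Welch (unequal-variance) form of the statistic and a simple top-k cut-off; the function names `t_test_scores` and `select_top_k`, the synthetic data and the value of `k` are hypothetical and are not taken from the paper.

```python
import numpy as np

def t_test_scores(cover, stego):
    """Absolute Welch t statistic for each feature column (assumed variant)."""
    cover = np.asarray(cover, dtype=float)
    stego = np.asarray(stego, dtype=float)
    mean_diff = cover.mean(axis=0) - stego.mean(axis=0)
    # Standard error of the difference of means, per feature.
    var_c = cover.var(axis=0, ddof=1) / cover.shape[0]
    var_s = stego.var(axis=0, ddof=1) / stego.shape[0]
    denom = np.sqrt(var_c + var_s)
    # Features with zero variance in both classes are given a score of 0
    # here so that no division by zero occurs.
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, np.abs(mean_diff) / safe, 0.0)

def select_top_k(cover, stego, k):
    """Indices of the k features with the largest |t|, best first."""
    scores = t_test_scores(cover, stego)
    return np.argsort(scores)[::-1][:k]

# Synthetic stand-in for real feature vectors: 200 cover and 200 stego
# samples with 5 features; only feature 2 is shifted by "embedding".
rng = np.random.default_rng(0)
cover_features = rng.normal(0.0, 1.0, size=(200, 5))
stego_features = rng.normal(0.0, 1.0, size=(200, 5))
stego_features[:, 2] += 1.0

scores = t_test_scores(cover_features, stego_features)
top_features = select_top_k(cover_features, stego_features, k=2)
print(top_features)
```

A filter of this kind is cheap because it scores each feature independently of any classifier, which is why it pairs naturally with a more expensive wrapper search over the surviving features.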
Steganalysis is broadly categorized as blind or specific. Blind image steganalysis can detect a hidden message irrespective of the underlying embedding technique and is therefore more practical, while specific steganalysis is useful only for known steganography tools [2]. Steganalysis can be treated as a pattern recognition problem with two classes. The performance of a steganalyser depends on two factors: the classifier and the features extracted from the images. Various methods have been proposed in the literature to improve the performance of steganalysers by enlarging the feature space, starting from the 274 features given by Fridrich [3], which were later extended to 548 [4]. The CF* compact rich model of 7,850 features for the DCT domain was further extended to 48,600 features [5]. Farid et al [6] presented 72 wavelet features based on CF (characteristic function) and PDF (probability density function) moments that provide improved accuracy. Recently Han Zong et al proposed 126 wavelet features [7] based on entropy, energy and combinations of PDF moments.

Some of these features may not be relevant to classification and may degrade the performance of the classifier. The objective of applying feature selection to steganalysis is to reduce computational complexity and increase classification accuracy. A feature selection methodology based on Mutual Information, proposed by Xia et al, improves the efficiency of the learning system [8]. Methods based on evolutionary algorithms presented in the literature include MBEGA based on the Markov Blanket [9], the Localized Generalization Error Model (L-GEM) [10], a Genetic Algorithm (GA) based on higher order statistics [11] and the Particle Swarm Optimization (PSO) algorithm [12], employing SVM and neural networks as classifiers. All of these feature selection techniques have been found to enhance the performance of a classifier for blind image steganalysis. A novel metaheuristic, the Firefly algorithm, was proposed by Yang [13].
It has been successfully employed in applications such as flowshop scheduling problems to minimize the makespan [14]. Banati and Bajaj [15] combined rough set theory with the Firefly algorithm. Yang showed that the Firefly algorithm outperforms both GA and PSO in terms of efficiency and success rate [16].

In this paper we present a new hybrid wrapper technique based on the Discrete Firefly algorithm (DFA) with a dynamic alpha parameter, combined with a t-test filter feature selection algorithm, to find the most relevant reduced subset of features. The aim of reducing the feature space is to improve the accuracy with which unseen images are classified as cover or stego and to improve the speed of the learning system. The proposed work is applied to images generated by four steganography tools: nsF5, PQ, Outguess and JPHS (JP Hide and Seek). The features used are the 274-dimensional DCT feature vector given by Fridrich [3] and the 72-dimensional DWT feature vector given by Farid [6]. The proposed algorithm provides insight into the statistical features which provide maximum information about underlying embedding