2556 IEEE SENSORS JOURNAL, VOL. 11, NO. 10, OCTOBER 2011

Agent Identification Using a Sparse Bayesian Model

Huiping Duan, Hongbin Li, Senior Member, IEEE, Jing Xie, Nicolai S. Panikov, and Hong-Liang Cui, Member, IEEE

Abstract—Identifying agents in a linear mixture is a fundamental problem in spectral sensing applications, including chemical and biological agent identification. The number of signatures in the spectral library is usually much larger than the number of agents actually present, so the sparsity of the mixing coefficient vector can be exploited to improve identification performance. In this paper, we propose a new agent identification method based on a sparse Bayesian model. The proposed iterative algorithm takes into account the nonnegativity of the abundance fractions and is proved to be convergent. Numerical studies with a set of ultraviolet (UV) to infrared (IR) spectra are carried out for demonstration. The effect of signature mismatch is also studied using a group of terahertz (THz) spectra.

Index Terms—Agent identification, false alarm, linear mixture, mismatch, signature, sparse Bayesian model, spectral sensing.

I. INTRODUCTION

Detection and identification of components from mixtures have been studied in various fields and applications, such as blind source separation for speech recognition [1], spectral unmixing in hyperspectral sensing [2], [3], and agent detection with Raman spectroscopy [4] or fluctuation-enhanced sensing (FES) [5], [6]. Classical detection algorithms, such as the matched subspace detector (MSD) [7], require knowledge of the noise and interference characteristics in terms of their probability density functions (pdfs) or statistics estimated from the received data. However, in many practical scenarios, the background information is unavailable or difficult to estimate, making these detection methods inapplicable.
Canonical correlation analysis [8] and the non-negative constrained least squares (NCLS) algorithm [5], [6] have been proposed for detection without a priori knowledge of the background interference.

Manuscript received January 25, 2011; accepted March 03, 2011. Date of publication March 22, 2011; date of current version August 24, 2011. The associate editor coordinating the review of this paper and approving it for publication was Prof. Kiseon Kim.
H. Duan and J. Xie are with L. C. Pegasus Corporation, Hillside, NJ 07205 USA (e-mail: duan0002@ntu.edu.sg; jingxie@lcpegasus.com).
H. Li is with the Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030 USA (e-mail: hli@stevens.edu).
N. S. Panikov is with the Department of Biology, Northeastern University, Boston, MA 02115 USA (e-mail: n.panikov@neu.edu).
H.-L. Cui is with the Department of Physics, Polytechnic Institute of New York University, Brooklyn, NY 11201 USA (e-mail: hcui@poly.edu).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSEN.2011.2130521

Because of its simplicity, the linear mixture model is often assumed in identification problems [2], [5], [6], [8]:

$$\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{n} \tag{1}$$

where $\mathbf{y}$ is an $N \times 1$ vector containing the $N$ observations to be analyzed, $\mathbf{A}$ is an $N \times M$ matrix with each column representing the spectral signature of a possible target, $\mathbf{x}$ is the $M \times 1$ mixing coefficient vector consisting of the concentration ratio or abundance fraction of each component in the mixture, and $\mathbf{n}$ is the $N \times 1$ noise vector standing for the measurement or modeling error. The objective of identification is to determine which components are in the mixture and, furthermore, to estimate the components' abundance fractions.

To solve the agent identification problem, a signature library is often employed that includes signatures of all possible targets.
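As a minimal sketch of the linear mixture model described above, the following Python fragment builds a noisy observation from a small signature library in which only a few entries are present. The library values, the sizes `N` and `M`, the abundance values, and the helper name `make_mixture` are hypothetical choices made for illustration; they are not data or code from this paper.

```python
import random


def make_mixture(library, abundances, noise_std=0.0, seed=0):
    """Form a linear mixture: weighted sum of signatures plus Gaussian noise.

    `library` is a list of signatures (the columns of the signature matrix),
    `abundances` holds one non-negative mixing coefficient per signature.
    """
    rng = random.Random(seed)
    n_obs = len(library[0])
    observation = []
    for i in range(n_obs):
        clean = sum(x_j * sig[i] for sig, x_j in zip(library, abundances))
        observation.append(clean + rng.gauss(0.0, noise_std))
    return observation


# Hypothetical toy library: M = 6 signatures, each with N = 8 samples.
lib_rng = random.Random(1)
N, M = 8, 6
library = [[lib_rng.random() for _ in range(N)] for _ in range(M)]

# Only two of the six library entries are present in the mixture (assumed values).
abundances = [0.0, 0.7, 0.0, 0.0, 0.3, 0.0]

y_clean = make_mixture(library, abundances)        # noise-free mixture
y_noisy = make_mixture(library, abundances, 0.01)  # small additive noise

present = [j for j, x_j in enumerate(abundances) if x_j > 0]
print("agents present (library indices):", present)
```

Identification then amounts to recovering the indices in `present`, and the corresponding abundance values, from `y_noisy` and `library` alone.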
Usually the number of signatures in the library is significantly larger than the number of agents composing the real mixture. Based on this fact, we propose to exploit the sparsity of the mixing coefficient vector to improve the identification performance. Using a sparse Bayesian model, the proposed method obtains an estimate of the abundance fraction vector with only a few nonzero entries, and the zero entries of the estimate can be interpreted as the absence of the corresponding agents. The iterative algorithm accounts for the non-negativity of the abundance fractions and is proved to be convergent. Agent identification is therefore accomplished through parameter estimation in a simple and direct way. In contrast, conventional agent identification methods [5], [6], [8] yield an estimate of the abundance fraction vector that is generally not sparse, so a threshold must be selected and applied to the estimate to decide whether a component is present. Choosing the threshold is difficult because of the lack of theoretical analysis.

Experiments using a set of ultraviolet (UV) to infrared (IR) spectra are carried out. The results show excellent estimation accuracy for the abundance fractions and good identification performance. The influence of signature mismatch is also studied, using the terahertz (THz) spectra of a group of bacteria for demonstration.

This paper is organized as follows. In Section II, the agent identification scheme using a sparse Bayesian model is described, along with a discussion of how to address the issue of signature mismatch. Section III contains the numerical results. Conclusions are drawn in Section IV.

II. PROPOSED AGENT IDENTIFICATION SCHEME

Since the size of the signature library is much larger than the number of agents present in the real sample, the true abundance fraction vector is a sparse vector with many zero entries.
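To make the role of sparsity concrete, the short Python sketch below contrasts the two ways of reading off the agents present: thresholding a non-sparse estimate versus taking the nonzero entries of a sparse one. Both estimate vectors, the threshold values, and the function names are made up for illustration, assuming a six-entry library in which entries 1 and 4 are truly present; they are not results from this paper.

```python
def detect_by_threshold(estimate, threshold):
    """Indices whose estimated abundance exceeds the chosen threshold."""
    return [j for j, value in enumerate(estimate) if value > threshold]


def detect_by_support(estimate):
    """Indices of the nonzero entries of a sparse estimate."""
    return [j for j, value in enumerate(estimate) if value != 0.0]


# Made-up estimates for a 6-entry library; agents 1 and 4 are assumed present.
dense_estimate = [0.04, 0.66, 0.02, 0.07, 0.27, 0.03]  # non-sparse estimate
sparse_estimate = [0.0, 0.69, 0.0, 0.0, 0.31, 0.0]     # sparse estimate

for threshold in (0.01, 0.05, 0.30):
    detected = detect_by_threshold(dense_estimate, threshold)
    print(f"threshold {threshold:.2f} -> {detected}")

print("support of sparse estimate ->", detect_by_support(sparse_estimate))
```

With these invented numbers, the lowest threshold flags every library entry, the middle one still raises one false alarm, and the highest misses a true agent, whereas the support of the sparse estimate needs no threshold at all.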
By treating the abundance fraction vector as a random variable with some prior pdf and incorporating the sparsity constraint into the estimator, a Bayesian approach can be developed with improved estimation accuracy over conventional methods. In the following, we solve for the sparse abundance fraction vector in the Bayesian framework, which has been investigated extensively in the machine