Poster Abstract: Statistical En-route Filtering in Large Scale Sensor Networks *

Fan Ye, Haiyun Luo, Songwu Lu, Lixia Zhang
UCLA Computer Science Department
Los Angeles, CA 90095-1596
{yefan,hluo,slu,lixia}@cs.ucla.edu

Categories and Subject Descriptors
C.2.1 [Computer Systems Organization]: Computer-Communication Networks—Network Architecture and Design: Wireless Communication

General Terms
Security, Design

Keywords
compromised nodes, false data, security, sensor networks

1. INTRODUCTION

In large-scale sensor networks serving mission-critical applications, one possible denial-of-service attack is false data reports injected by attackers. Such false reports could lead to false alarms, exhaustion of the en-route sensors’ limited battery energy, and congestion of wireless channels with limited bandwidth. Although recent work on sensor message authentication [2, 1] can effectively block false data injections from external nodes, it is rendered ineffective by compromised nodes, which can authenticate themselves to neighbors and correctly encrypt false messages. Such attacks through compromised nodes are possible because sensor networks are usually unattended. An attacker can physically capture a sensor and obtain the security information stored in it without being detected.

We propose Statistical En-route Filtering (SEF), which can filter out such false reports en-route as they are forwarded toward the data collection point (called “sink”). SEF leverages the scale of the sensor network and the high density of sensor node deployment. To identify false data reports injected by compromised nodes, SEF relies on the collective efforts of both the sensors surrounding the report generation locations and the sensors along the data delivery paths.
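The threat described in this section can be made concrete with a minimal sketch. It assumes a hypothetical hop-by-hop scheme in which a node authenticates each report to its neighbor with an HMAC under one shared pairwise key; the key values, function names, and report contents are illustrative assumptions and are not taken from [2, 1]. An outsider who lacks the key is rejected, but a compromised node that holds the key produces a valid MAC over a fabricated report.

```python
import hashlib
import hmac


def make_mac(key: bytes, report: bytes) -> bytes:
    """Compute a keyed MAC over a report (HMAC-SHA256 chosen for illustration)."""
    return hmac.new(key, report, hashlib.sha256).digest()


def neighbor_accepts(key: bytes, report: bytes, mac: bytes) -> bool:
    """A neighbor accepts a report only if the MAC verifies under the shared key."""
    return hmac.compare_digest(make_mac(key, report), mac)


# Hypothetical pairwise key shared by two neighboring nodes.
pairwise_key = b"key-shared-by-node-A-and-node-B"

# A fabricated report describing an event that never happened.
false_report = b"event=intrusion;location=(12,40)"

# External attacker: does not know the key, so the MAC fails to verify.
outsider_mac = make_mac(b"attacker-guess", false_report)
outsider_accepted = neighbor_accepts(pairwise_key, false_report, outsider_mac)

# Compromised node: the key was extracted from captured hardware, so the
# MAC over the same false report verifies correctly.
insider_mac = make_mac(pairwise_key, false_report)
insider_accepted = neighbor_accepts(pairwise_key, false_report, insider_mac)

print("outsider accepted:", outsider_accepted)
print("compromised node accepted:", insider_accepted)
```

In this sketch a single key holder can vouch for a report on its own, which is the gap SEF is designed to close.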
Specifically, when an actual sensing target (called “stimulus”) occurs in the field, SEF lets multiple surrounding sensors collectively generate a legitimate report that carries multiple keyed message authentication codes (MACs). These MACs are the “passport” for the report as it traverses the sensor network towards the sink. A report with fewer than a threshold number of MACs will be dropped. Through proper key assignments, each node can only generate one legitimate MAC for each report. An attacker who has captured a small number of sensor nodes has to forge incorrect MACs to inject a seemingly legitimate report.

SEF lets sensor nodes share keys probabilistically to enable statistical en-route verification of a report’s MACs. Any forwarding node has a certain probability of possessing one of the keys used in generating these MACs. A data report is dropped immediately upon the detection of any incorrect MAC. As more and more intermediate sensor nodes forward a data report, the probability of detecting incorrect MACs increases. Finally, the sink verifies all the MACs of each received data report and filters out those false reports that escape the statistical en-route filtering.

SEF only uses computationally efficient one-way hash functions to conserve the computation resources of small sensor nodes. To minimize the communication overhead and the corresponding energy consumption, we use a Bloom filter to compress the MACs while retaining en-route verification of the MACs. Through analysis and extensive simulations, we show that with an overhead of 14 bytes per report, SEF is able to drop 80–90% of the false reports injected through a compromised node within 10 hops.

* This work is supported by DARPA under contract DABT63-99-1-0010.

Copyright is held by the author/owner.
SenSys’03, November 5–7, 2003, Los Angeles, California, USA.
ACM 1-58113-707-9/03/0011.

2. DESIGN

SEF consists of three pieces: 1) key assignment to sensor nodes for MAC generation; 2) en-route verification of MACs to filter false data reports; 3) sink verification of each MAC to detect false reports that escape the en-route filtering.

2.1 Key Assignment and Report Generation

Nodes share keys to a certain degree to enable en-route verification of MACs, but the sharing is also constrained to prevent one compromised node from generating all the MACs required in a legitimate report. To this end, we use a global key pool of N keys, divided into n non-overlapping partitions with m = N/n keys each. Each key has a unique key index.

Before a sensor node is deployed, we load it with k (k < m) keys and their indices, randomly chosen from one of the n partitions. That is, a sensor node only possesses keys from one single partition, but two nodes have a certain probability of sharing keys because they may pick keys from the same partition. Only the sink knows all the keys.

When a stimulus appears, multiple nodes that detect it collaborate to process the signal and elect a Center-of-Stimulus