Smartphone-Based Public Health Information Systems: Anonymity, Privacy and Intervention

Andrew Clarke
Discipline of Health Informatics, University of Sydney, Sydney, NSW 2006, Australia. E-mail: andrew.clarke@sydney.edu.au

Robert Steele
Division of Health Informatics, Medical University of South Carolina, Charleston, SC, 29425, USA. E-mail: steelerj@musc.edu

The pervasive availability of smartphones and their connected external sensors or wearable devices can provide a new public health data collection capability. Current research and commercial efforts have concentrated on sensor-based collection of health data for personal fitness and healthcare feedback purposes. However, to date there has not been a detailed investigation of how such smartphones and sensors can be utilized for public health data collection purposes. Public health data have the characteristic of being capturable while still not infringing upon privacy, as the full detailed data of individuals are not needed but rather only anonymized, aggregate, de-identified, and non-unique data for an individual. For example, rather than details of physical activity including specific route, just total caloric burn over a week or month could be submitted, thereby strongly assisting non-re-identification. In this paper we introduce, prototype, and evaluate a new type of public health information system to provide aggregate population health data capture and public health intervention capabilities via utilizing smartphone and sensor capabilities, while fully maintaining the anonymity and privacy of each individual. We consider in particular the key aspects of privacy, anonymity, and intervention capabilities of these emerging systems and provide a detailed evaluation of anonymity preservation characteristics.
Introduction

The rapid growth in both the capabilities and uptake of smartphones suitable to act as health sensor platforms has the potential to advance public health data collection and intervention in significant ways. Although, increasingly, research and development is concentrating on how mobile devices and sensors can be used as a tool for individual health data capture and feedback, this has not extended into investigation of how these devices can be used for public health data capture. Interestingly, the case for public health usage does not require the same level of precise data that would often be required in participatory sensing (Burke et al., 2006) applications in other domains. For example, the exact location and time of a measured sensor value is less important than the aggregate value over a period of time or the trend or change for a community as a whole.

This article is a significantly extended version of a previous conference work (Clarke & Steele, 2014). In particular this article differs in that it analyzes these novel smartphone-based public health information systems as a generic new type of system, describes the results from building a significant prototype system and carries out a substantially more detailed privacy and anonymity analysis.

We describe a class of smartphone-based information systems for anonymized public health data capture and intervention. Interventions (Klasnja & Pratt, 2012) in this work are in the form of informational messages sent to an individual’s smartphone, intended to create a health-related behavioral change, and are a key component of future Health Participatory Sensing Networks (HPSNs). In particular, as we later describe, a significant new capability enabled by these systems is that a targeted public health intervention can be distributed, performed, and evaluated without the need for the identifying details of an individual to ever leave their mobile device.
The introduced system eschews the need for a fully trusted central server, which might prove impractical, or itself a significant privacy risk, in population-scale applications. Instead, it adopts an architecture in which a central aggregation server communicates with the end-user mobile devices only via an intervening anonymizing layer, and in which local processing on each mobile device ensures non-re-identifiability of the user from their submitted sensor data.

Received March 6, 2014; revised May 28, 2014; accepted May 29, 2014
© 2015 ASIS&T. Published online 2 April 2015 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/asi.23356
JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY, 66(12):2596–2608, 2015
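As an illustration of the local processing step described above, the following minimal Python sketch shows how detailed activity records might be reduced on the device to a coarse weekly value before submission. It is not the prototype's implementation: the names `ActivityRecord` and `weekly_submission`, the record fields, and the `bin_width` coarsening parameter are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ActivityRecord:
    """Detailed on-device record (hypothetical); never submitted in this sketch."""
    day: date
    route: str        # sensitive detail, e.g. an identifier for a GPS trace
    calories: float


def weekly_submission(records, week_start, bin_width=500):
    """Reduce one week of detailed records to a coarse, identifier-free value.

    bin_width is an assumed coarsening parameter, not taken from the paper.
    """
    week_end = week_start + timedelta(days=7)
    total = sum(r.calories for r in records if week_start <= r.day < week_end)
    # Round down to the lower bound of the bin so that many users
    # report the same value, which makes the value less unique.
    binned = int(total // bin_width) * bin_width
    # Only the week and the coarse total are returned: no route, no user ID.
    return {"week_start": week_start.isoformat(), "calories_bin": binned}


records = [
    ActivityRecord(date(2014, 3, 3), "route-a", 420.0),
    ActivityRecord(date(2014, 3, 5), "route-b", 515.5),
    ActivityRecord(date(2014, 3, 12), "route-a", 300.0),  # following week
]
submission = weekly_submission(records, date(2014, 3, 3))
print(submission)  # {'week_start': '2014-03-03', 'calories_bin': 500}
```

In this sketch the route and per-day detail stay on the device, and only the binned weekly total would be passed on toward the aggregation server through the anonymizing layer. The choice of bin width trades data precision against how many users share a given value.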