“Digital Natives”: How Medical and Indigenous Histories Matter for Big Data

by Joanna Radin*

ABSTRACT

This case considers the politics of reuse in the realm of “Big Data.” It focuses on the history of a particular collection of data, extracted and digitized from patient records made in the course of a longitudinal epidemiological study involving Indigenous members of the Gila River Indian Community Reservation in the American Southwest. The creation and circulation of the Pima Indian Diabetes Dataset (PIDD) demonstrates the value of medical and Indigenous histories to the study of Big Data. By adapting the concept of the “digital native” itself for reuse, I argue that the history of the PIDD reveals how data becomes alienated from persons even as it reproduces complex social realities of the circumstances of its origin. In doing so, this history highlights otherwise obscured matters of ethics and politics that are relevant to communities who identify as Indigenous as well as those who do not.

WHAT’S IN A NAME?

Several years ago I found myself in conversation with a mathematician. She was an expert in a field of problem solving called machine learning. As she explained it to me, applications of her work served to do things like optimize Google search rank orders, to make sure that people found what they were looking for or, perhaps, what they did not even know they were looking for. At the time of our conversation, she was using her expertise to help the electricity provider Con Edison to predict fires sparking in the underground power grid in New York City. Such fires, in addition to disrupting service, could lead to dangerous explosions of manhole covers.¹ To address the chal-

* Section of the History of Medicine, Sterling Hall of Medicine, Yale University, 333 Cedar Street, L132, New Haven, CT 06520; joanna.radin@yale.edu.
I would like to thank Cynthia Rudin and David Aha for conversations about their machine learning work and Jennifer Brown, Laurel Waycott, Laura Stark, and participants in the “Big Data and Invisible Labor” symposia at the Max Planck Institute for the History of Science.

© 2017 by The History of Science Society. All rights reserved. 0369-7827/11/2017-0003$10.00

¹ Cynthia Rudin, “21st Century Data Miners Meet 19th Century Electrical Cables,” Computer 44 (2011): 103–5; Rudin, “Machine Learning for the New York City Power Grid,” IEEE Transactions on Pattern Analysis and Machine Intelligence 34 (2012): 328–45. These circular, cast-iron caps cover holes constructed to provide repair workers access to the underground power grid when maintenance is required. For some data on the frequency and severity of fires beginning in manholes, see https://users.cs.duke.edu/~cynthia/docs/RudinEtAl2011ComputerMagazine.pdf (accessed 26 September 2014).

OSIRIS 2017, 32: 43–64