Repurposing: Another R to Prioritize in Data Curation

Michael Twidale
School of Information Sciences, University of Illinois at Urbana-Champaign, Champaign, Illinois, USA
twidale@illinois.edu

Catherine Blake
School of Information Sciences, University of Illinois at Urbana-Champaign, Champaign, Illinois, USA
clblake@illinois.edu

Maria Souden
US Department of Veterans Affairs, VA Information Resource Center (VIReC), Edward Hines, Jr. VA Hospital, Hines, Illinois, USA
maria.souden@va.gov

Jenifer Stelmack
US Department of Veterans Affairs, VA Information Resource Center (VIReC), Edward Hines, Jr. VA Hospital, Hines, Illinois, USA
jenifer.stelmack@va.gov

Jenna Kim
School of Information Sciences, University of Illinois at Urbana-Champaign, Champaign, Illinois, USA
jkim682@illinois.edu

ABSTRACT
Data repurposing has great potential for innovative research. But there are distinct risks that researchers outside the group that originally collected the data will misunderstand the actual meaning of particular fields and values. These meanings may not be sufficiently well documented, because those within the collecting group already know what is really meant in the context of the original purposes of collection.

KEYWORDS
Reuse; repurposing; metadata; tacit knowledge

ASIS&T THESAURUS
Information flow; Information loss

INTRODUCTION
Various researchers have proposed terms important in data use, including rerun, repeat, reproduce, reuse and replicate, e.g. [Benureau & Rougier, 2017]. We propose another R: repurpose, a special kind of data reuse in which data are used for a reason other than the one for which they were originally collected. Data repurposing has great potential for innovative research, but there are risks of misunderstanding the meaning of particular fields and values. As part of a larger study [Blake, Souden, Anderson, Twidale, & Stelmack, 2015] we interviewed 18 researchers about their experiences of technical help-giving.
Here we report issues that arise when data collected for clinical purposes are repurposed for research.

RESULTS
“… the data don’t necessarily mean what you think they mean. I think a lot of the researchers forget, or are not aware, or didn’t even think of ways in which data that are collected for the primary purpose of taking care of a patient, may not fit snugly with their research agenda.”

All groups use technical terms, jargon and specialist meanings of words. This enables efficient communication within the group, but can cause confusion for outsiders. The obvious case is where a group uses special terms and acronyms. In that case, however, the person outside the group is at least aware that they do not understand something and can try to find out what is actually meant. It is more problematic when a word or phrase has a particular meaning within the group but a subtly (or not so subtly) different meaning outside it. The person outside the group may then be unaware that there is a different meaning and so may misinterpret the data.

This is especially problematic in datasets. There is an understandable need to use terse terms for field names and values, and this terseness can hide subtle qualifications about the actual meaning of the field or about what was actually collected.

Problems can occur even when two groups are very closely related. In our case, the two groups were clinicians who had collected the data for the purposes of treating patients, and medical researchers doing subsequent, often longitudinal, data analysis. This was illustrated by a researcher who happened to be in both groups:

“I have operational privileges, which means that I learn tremendously from the operation work. And sometimes, I feel, I feel bad for those that are strictly on the research side. Sometimes because they’re trying to look through a keyhole. They don’t necessarily understand the data.
…

82nd Annual Meeting of the Association for Information Science & Technology | Melbourne, Australia | 19–23 October, 2019
Author(s) retain copyright, but ASIS&T receives an exclusive publication license
DOI: 10.1002/pra2.00176