Draft – Do Not Cite

Mathematical Values in Data Science: Three Conjectures to Explain Mathwashing

Patrick Allo
patrick.allo@vub.be
www.logicandinformation.be

The overarching background against which this article is written is the development of a critical research agenda on contemporary data practices. In particular, I’m interested in the question of which critical epistemology we need to support our thinking about the ethical risks that are associated with data practices, from the increasing grip of data science on knowledge creation and decision-making in general to the development of automated systems.

1

A dedicated epistemological engagement with data science is desirable for many reasons. It matters because many ethically relevant effects of our data practices can be traced back to epistemological shortcomings (unrepresentative data, mistaken predictions, unwarranted confidence in the obtained insights, etc.). A better understanding of the kind of knowledge that data science creates, and of the standards to which it should conform, can help us to navigate these ethical challenges with more confidence, for instance by making the applicable and desirable epistemic standards explicit.

But that is only one of the motivations. Thinking through the interplay between ethics and epistemology also reveals that simply fixing the surface-level epistemological shortcomings (think: more and better data, higher predictive accuracy, etc.) does not automatically address all ethical concerns (Mittelstadt et al. 2016), or even all epistemological ones. Many epistemological dimensions, some of which are associated with epistemological virtues (e.g. transparency, intelligibility, explanatory value) or with the socio-epistemic side of data science (such as the absence of questionable information asymmetries), do not naturally fit in a narrow accuracy-centric account of data science.
Similarly, many ethical challenges, such as those related to discrimination, become more tractable once we are clearer about what is epistemologically feasible and desirable, but they still need to be addressed from within the confines of an ethical normative framework. So conceived, getting a grip on the entanglement of ethics and epistemology is crucial for understanding the division of labour between two complementary and intersecting normative frameworks, for ensuring that all risks can be properly conceptualised and addressed, and for exposing the limits of technical solutionism¹ in dealing with genuine ethical concerns.

¹ See Cath (2018: 3) and Lipton & Steinhardt (2018: §3.4).

In the present article I want to approach these concerns at a more conceptual level. To begin with, I want to highlight that, despite the ease with which we evaluate many data practices—call them