Web Science and the Two (Hundred) Cultures: Representation of Disciplines Publishing in Web Science

Clare J. Hooper
IT Innovation Centre, University of Southampton, UK
cjh@it-innovation.soton.ac.uk

Georgeta Bordea
Digital Enterprise Research Institute, NUI, Galway
bordea.georgeta@deri.og

Paul Buitelaar
Digital Enterprise Research Institute, NUI, Galway
buitelaar.paul@deri.org

ABSTRACT
Web Science is an interdisciplinary field. Motivated by the unforeseen scale and impact of the web, it addresses web-related research questions in a holistic manner, incorporating epistemologies from a broad set of disciplines. There has been ongoing discussion about which disciplines are more or less present in the community, and about defining Web Science itself: there is, however, a dearth of empirical work in this area. This paper presents an analysis of the presence of different disciplines in Web Science. We applied Natural Language Processing and topic extraction to a corpus of Web Science material, analysing it with graphing and visualisation tools, MATLAB and an expert survey. We discovered four communities within Web Science, and trends in the conference series over time (a strong impact from collocation) and format (posters covering a broader range of topics than papers). The expert survey linked highly ranked terms with disciplines, yielding strong links with Communication, Computer Science, Psychology, and Sociology. Controversially, experts described highly ranked topics and suggested disciplines (extracted from WebSci CFPs) as not reflecting the nature of Web Science.

Author Keywords
Web Science discipline; community analysis; bibliometrics; disciplines; Saffron.

ACM Classification Keywords
K.4.m. Computers and Society: Miscellaneous.

General Terms
Human Factors; Measurement; Theory.

INTRODUCTION
In 1959, C. P.
Snow famously gave his lecture, ‘The Two Cultures’ [22], in which he lamented the division of the sciences and the humanities, and the negative impact of that division upon intellectual progress across society. Web Science, like certain other disciplines, is at its heart radically interdisciplinary… or is it?

There has been ongoing discussion about the representation of various disciplines within the Web Science community. Forming a stable, diverse community is no small task: members of the Web Science Trust have worked to try to ensure that the community is balanced, with a rich variety of well-represented disciplines, and not dominated by one field such as Computer Science.

Defining Web Science can be difficult. Tools to describe the field include the ‘Web Science butterfly’ diagram, used early in the life of Web Science to convey the vision [18], but this diagram is a vision rather than an accurate depiction of the state of the field [12]. Similarly, the Web Science Subject Categorisation [23] only offers a vision and structure, not information on subjects’ prevalence within the community.

This is problematic. Understanding the actual presence (measured by publications) of different disciplines within Web Science offers several advantages, letting us: better communicate what work is done under the WebSci flag; ground dialogue about Web Science diversity and disciplinary representation in data, identifying under-represented, over-represented and absent disciplines; and identify problems that need addressing, then take action by seeking collaborations and communities that would remediate current weaknesses within Web Science.

One paper at WebSci’12 began to examine this area, proposing a methodology and presenting early results obtained with it. (The next section details differences between that work and this.)
We build on that work, drawing on a corpus of papers from past Web Science conference proceedings, journal.webscience.org, and other sources. We used Natural Language Processing to extract terms from these, and conducted a network analysis of the resultant materials (which we have made available online, with links to the corpus¹). This paper presents an analysis and discussion concerning:

1. Communities found within the corpus
2. Changes in the Web Science conference series over time
3. Changes in Web Science conference publications according to format

¹ See: clarehooper.net/WebSciCorpus

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
WebSci’13, May 2–4, 2013, Paris, France.
Copyright 2013 ACM 978-1-4503-1889-1...$10.00.