Identity, Reference, and Meaning on the Web

Harry Halpin
Institute for Communicating and Collaborative Systems
University of Edinburgh
2 Buccleuch Place
Edinburgh, United Kingdom
H.Halpin@ed.ac.uk

ABSTRACT

Problems of reference, identity, and meaning are becoming increasingly endemic on the Web. We focus first on the convergence between Web architecture and classical problems in philosophy, leading to the advent of “philosophical engineering.” We survey how the Semantic Web initiative in particular provoked an “identity crisis” for the Web due to its use of URIs for both “things” and web pages, and we survey the W3C’s proposed solution. The problem of reference is inspected in relation to both the direct object theory of reference of Russell and the causal theory of reference of Kripke, and the proposed standards of new URN spaces and Published Subjects. Then we progress onto the problem of meaning in light of the Fregean slogan of the priority of meaning over reference and the notion of logical interpretation. The popular notion of “social meaning” and the practice of tagging as a possible solution are analyzed in light of the ideas of Lewis on convention. Finally, we conclude that a full notion of meaning, identity, and reference may be possible, but that how practical implementations and standards can be created remains an open problem.

1. PHILOSOPHICAL ENGINEERING

While the Web epitomizes the beginning of a new digital era, it has also caused an untimely return of philosophical issues in identity, reference, and meaning. These questions are thought of as a “black hole” that has long puzzled philosophers and logicians. Up until now, there has been little incentive outside academic philosophy to solve these issues in any practical manner. Could there be any connection between the fast-paced world of the Web and philosophers who dwell upon unsolvable questions?
Yet in a surprising move, the next stage in the development of the Web seems to be signalling a return to the very same questions of identity, reference, and meaning that have troubled philosophers for so long. While the hypertext Web has skirted around these questions, attempts at increasing the scope of the Web cannot: “The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation” [41]. Meaning is a thorny word: do we define meaning as “machine-readable” or “has a relation to a formal model?” Or do we define meaning as “easily understood by humans,” or “somehow connected to the world in a robust manner?” Further progress toward solutions that are both satisfying and pragmatic is possible in the context of the Web, since many of these questions are currently left underspecified by Web standards. While many in philosophy seem to be willing to hedge their bets in various ideological camps, on the Web there is a powerful urge to co-operate. There is a distinct difference between the classical posing of these questions in philosophy and these questions in the context of the Web, since the Web is a human artifact. The inventor of the Web, Tim Berners-Lee, summarized this position: “We are not analyzing a world, we are building it. We are not experimental philosophers, we are philosophical engineers” [2].

Copyright is held by the author/owner(s). WWW2006, May 22–26, 2006, Edinburgh, UK.

2. THE IDENTITY CRISIS OF URIS

The first step in the creation of the Semantic Web was to extend the use of a URI (Uniform Resource Identifier) to identify not just web pages, but anything.
This was historically always part of Berners-Lee’s vision, but it only recently came to light with Semantic Web standardization efforts, and it has provoked disagreement from some of the other original Web architects, such as Larry Masinter, co-author of the URI standard [4]. In contrast to past practice, which generally used URIs for web pages, URIs could now be given to things traditionally thought of as “not on the Web,” such as concepts and people. The guiding example is that instead of just visiting Tim Berners-Lee’s web page to retrieve a representation of Tim Berners-Lee via http, you could use the Semantic Web to make statements about Tim himself, such as where he works or the color of his hair.

Early proposals opened a chasm across URIs, dividing them into URLs and URNs. URIs for web pages (documents) are URLs (Uniform Resource Locators) that could use a scheme such as http to perform a “variety of operations” on a resource [5]. In contrast, URNs (Uniform Resource Names) purposely avoided such access mechanisms in order to create “persistent, location-independent, resource identifiers” [29]. URNs were not widely adopted, perhaps due to their centralized nature, which required explicitly registering them with IANA. In response, URLs were simply called “URIs” and used not only for web pages, but also for things not on the Web. Separate URN standards such as Masinter’s tdb URN space have been declared, but have not been widely adopted [27]. Instead, people use http in general to identify both web pages and things. There is one sensible solution to get a separate URI for the thing if one has a URI that currently serves a representation of a thing, but one wishes to make statements about the