Informatica Economică vol. 19, no. 1/2015
DOI: 10.12948/issn14531305/19.1.2015.01

Ontology-based Integration of Web Navigation for Dynamic User Profiling

Anett HOPPE, Ana ROXIN, Christophe NICOLLE
CheckSem Research Group, Laboratoire Electronique, Informatique et Images (LE2I)
Université de Bourgogne, 21068 Dijon, France
{anett.hoppe, ana.roxin, christophe.nicolle}@checksem.fr

The development of technology for handling information on a Big Data scale is a buzzing topic of current research. Indeed, improved techniques for knowledge discovery are crucial for the scientific and economic exploitation of large-scale raw data. In a research collaboration with an industrial actor, we explore the applicability of ontology-based knowledge extraction and representation to today's biggest source of large-scale data, the Web. The goal is to develop a profiling application that uses the implicit information every user leaves while navigating online to identify and model preferences and interests in a detailed user profile. This includes the identification of current tendencies as well as the prediction of possible future interests, as far as they are deducible from the collected browsing information and the integrated expert domain knowledge. This article gives an overview of the current state of the research, the developments made and the insights gained.

Keywords: Semantic Web, Ontologies, SWRL, Big Data reasoning

Introduction

"Big Data" is one of the big buzzwords of our time, culminating in the creation of various congresses and conferences devoted solely to that topic in recent years (e.g. IEEE Congress on Big Data, starting from 2011). The handling of immense amounts of data places scientists and analysts in a dilemma: on the one hand, sophisticated analysis techniques might yield the best results, but they usually come with a processing complexity and running time that are not tolerable for most applications.
On the other hand, methods known for their efficiency may fail to exploit the data sources in all their depth.

Several research works have proposed distinct criteria to define the nature of "Big Data" (e.g. [1]). The definitions largely converge on the following five:
volume: massive amounts of data have to be treated;
velocity: those data arrive at high speed;
variety: data types and formats are heterogeneous;
veracity: data are not always sound and have to be verified;
value: the data have an inherent value that has to be discovered by the application.
Applications acting in a Big Data context have to handle all five efficiently, balancing analysis depth against processing time.

For that very reason, semantic technology is often dismissed in a Big Data context. Semantic analysis seems too complex and too costly to be affordable in an environment in which even very efficient techniques often fail to meet the performance requirements. We want to make a case for ontology-based knowledge representation, even when handling vast amounts of data. Employing an ontology that has been customised in detail for the application domain limits the information to those bits and bytes that are actually relevant. Furthermore, we avoid performance issues by decoupling costly analysis steps from the actual, real-time user profiling process (please refer to Section 0 for details).

We demonstrate this approach based on an application in digital advertising. Publishers nowadays have detailed information about their users' navigation behaviour: servers