International Journal on Advances in Intelligent Systems, vol 8 no 1 & 2, year 2015, http://www.iariajournals.org/intelligent_systems/

2015, © Copyright by authors, Published under agreement with IARIA - www.iaria.org

Foundations of Semantic Television
Design of a Distributed and Gesture-Based Television System

Simon Bergweiler and Matthieu Deru
German Research Center for Artificial Intelligence (DFKI)
Saarbrücken, Germany
Email: firstname.lastname@dfki.de

Abstract: Innovations in information and communication technologies are changing our daily lives and the way we interact with intelligent systems. Powerful computers are becoming smaller and are integrated almost anywhere, even in televisions. Today's connected television systems offer many technical functions, including those currently integrated in smartphones. In this article, we describe an innovative approach in the form of an intelligent television system named Swoozy, which enables viewers to discover extended information, such as facts, images, shopping recommendations, or video clips, about the currently broadcast TV program by using the power of Internet and Semantic Web technologies. Via a gesture-based user interface, viewers get answers, directly on their television, to questions they may ask themselves during a movie or TV report. These questions are very often related to the name and vita of the featured actor, the place where a scene was filmed, or purchasable books and items about the topic of the report the viewer is watching. Furthermore, a new interaction concept for TVs is proposed that uses semantic annotations called Grabbables, which are displayed on top of the videos and provide a semantic reference between the videos' content and an ontological representation in order to access Semantic Web Services.

Index Terms: interactive television system; Semantic Web Technologies; video annotation; gesture-based interaction.

I. INTRODUCTION

With the growing popularity of smartphone applications (apps), a trend has slowly emerged to integrate their capabilities into television systems. So-called connected television systems provide a wide range of technical capabilities that open up new possibilities for viewers to communicate and interact with the Internet and its services, with features similar to those their smartphones currently provide. This article describes an innovative approach in the form of an intelligent television system named Swoozy [1]. This self-designed and implemented system enables viewers to discover extended information, such as facts, images, shopping recommendations, or video clips, about the currently broadcast TV program by using the power of Internet and Semantic Web technologies.

A study conducted by SevenOneMedia [2], a German marketer for audiovisual media, reveals that 45 % of a viewer panel aged between 14 and 29 surf the Web while watching television, and that the main purpose of this browsing activity is to find more information about the program, e.g., an actor's name or biography, a location, or a depicted product. This search is likely done either by using a mobile or TV app or by proactively typing a keyword or complete phrase into a Web search engine.

The current development trend in interactive connected television systems is very app-oriented: users must install many single-purpose apps, for example, one for searching videos and another for images, in order to get the information they are looking for. Another technology that is widespread in Europe is the Hybrid Broadcast Broadband TV (HbbTV) standard, which offers viewers an alternative to apps but is currently still limited in its interaction and search possibilities. These trends and technologies are described in detail in Section III.
The usage of these solutions also reveals another problem: constantly switching between several apps obliges users to leave their TV program and to interact several times with the remote control before finally getting the information they were looking for. To solve these interaction issues, the approach discussed here presents a new way for viewers to interact with additional content while watching a TV program. With our solution, they are able to search the Web for information in parallel and easily browse through the results without a break in interaction. In its first version, the developed prototype system relies on semantic annotations obtained from the analysis of a broadcast video, combined with gesture-based interactions that enable users to directly start a Web search using Semantic Web technologies and to get precise additional information related to the currently shown scene, such as further videos, text or news articles, pictures, and shopping recommendations.

Whereas system prototypes like NoTube [3] and others [4][5][6] use the Semantic Web to detect possible matches between the watched program and other Web-based content solely to offer personalized TV access, our approach uses semantic technologies on several levels. The first level is the extraction of knowledge and concepts from an ordinary, non-pre-annotated Digital Video Broadcasting (DVB) data stream (also called video signal). From this DVB data stream, the required information is extracted and transferred via matching rules into annotations. Through an intuitive, dedicated gesture-based graphical TV interface, presented in Section V, the viewer can easily trigger a search using semantic queries. These queries are finally processed by a specially designed and implemented