Providing multimodal context-sensitive services to mobile users

° Carmelo Ardito, °* Thomas Pederson, ° Maria Francesca Costabile, ° Rosa Lanzilotti

° Dipartimento di Informatica, Università di Bari, 70125 Bari, Italy
* Department of Computing Science, Umeå University, SE-90187 Umeå, Sweden

{ardito, pederson, costabile, lanzilotti}@di.uniba.it, top@cs.umu.se

Abstract. In this paper, we describe a framework designed to provide multimodal context-sensitive services to mobile users. This work is part of the CHAT project, which aims at developing a general-purpose infrastructure for multimodal situation-adaptive user assistance. We specifically describe two conceptual cornerstones of the project: a) a multimodal interaction framework targeted at providing service access through different modalities for different real-world situations and at improving interaction with mobile devices in general; b) an "egocentric interaction" model for framing interaction with objects in the vicinity of a mobile user, including real-world and/or computational entities other than the mobile device itself, ranging from computationally "stupid" everyday objects to more advanced interactive devices such as desktop PCs. The final section of the paper is devoted to open issues in the design of the CHAT infrastructure related to the topic of user assistance in intelligent environments.

Keywords: Multimodal interfaces, mobile human-computer interaction.

1. Introduction

The CHAT project ("Cultural Heritage fruition & e-learning applications of new Advanced (multimodal) Technologies") aims at developing a software infrastructure that provides services accessible through thin clients, such as cellular phones or PDAs, that are: a) adaptable to the personal preferences of the user, with a focus on the choice of interaction modalities; and b) adaptive to the physical-virtual context of the human actor carrying the device.
In both cases, the proposed architecture should be open to channeling interaction between services and user both through the mobile device itself and through available input and output facilities in the vicinity. Furthermore, real-world phenomena sensed by the device itself, or indirectly through external sensor pools, will be made available through the CHAT infrastructure as a resource for service developers to effectively design "intelligent" environments. Users are allowed to interact with the system using several input channels simultaneously, a mode classified by the W3C as simultaneous coordinated multimodality [7]. Empirical studies targeting this kind of multimodality are described in [1, 2]. The architecture supporting this kind of multimodal system is more complex than that of traditional interactive systems, because we have to consider: a) parallel recognition