Web Clipping: Compression Heuristics for Displaying Text on a PDA

PEDRO GOMES, SÉRGIO TOSTÃO, DANIEL GONÇALVES, JOAQUIM JORGE
Instituto Superior Técnico
Av. Rovisco Pais, 1049-001 Lisboa
djvg@gia.ist.utl.pt, jorgej@acm.org
+351 917935280, +351 937021798, +351 21841769

Abstract. While there is strong motivation to do so, reading Web pages on portable devices still leaves much to be desired. Most solutions need special versions of Web sites and do not cope adequately with the resource limits of PDAs, in particular their small screen, which makes it difficult to display large amounts of information in a usable manner. We describe an approach that gives users greater flexibility in selecting the information they want displayed, while coping with display limitations by analyzing content and organizing it into abstract visualization levels. The user can zoom in and out through successive levels of detail and navigate the content without being overwhelmed by clutter. We have developed heuristics for filtering text through task analysis and usability studies, whose results are described here. These studies provided meaningful insights that help us explore trade-offs between information filtering (and therefore text compression) and text comprehension.

Keywords: Zoomable Interfaces, Web-Clipping, Morphological Text Analysis

1. Introduction

The use of mobile computing devices, such as Personal Digital Assistants (PDAs), is becoming widespread. These devices usually have a small screen and reduced storage and processing capacity, and they replace the keyboard and mouse with a pen and direct manipulation of on-screen objects. Given the ever-increasing amount of information available on the World-Wide Web, there is a growing desire to read web documents on a PDA. However, most web documents aren’t designed to cope with the limitations of those devices. Hence, it is usually not possible to read them on portable devices without some kind of transformation.

Several solutions have been tried to overcome those limitations. They usually require an alternate, trimmed-down version of the documents to be prepared beforehand. Some of the most popular solutions, such as Web-Clipping, developed by Palm, Inc. [1], or AvantGo (http://www.avantgo.com), do so. This is undesirable because it involves increased effort in creating and maintaining alternate versions of a site, and because only prepared sites can be read. Also, it doesn’t deal with the problem of the small screen, where only a few lines of text can be shown at a time. Long documents might become too cumbersome to read in such a fashion. A questionnaire given to 30 Internet and PDA users about popular PDA Web-reading solutions confirms this. The most popular readers mentioned were AvantGo, iSilo and SmartDoc. Most respondents mentioned that they allow “Web page off-line visualization, frequently updated”, and are “Fast, Simple and Useful