SEPTEMBER/OCTOBER 2006 1541-1672/06/$20.00 © 2006 IEEE Published by the IEEE Computer Society

Interactive Entertainment

Technologies That Make You Smile: Adding Humor to Text-Based Applications

Rada Mihalcea, University of North Texas
Carlo Strapparava, Istituto per la ricerca scientifica e Tecnologica

Natural language's creative genres are traditionally considered to be outside the scope of computational modeling. Computational linguists have paid little attention to humor in particular because it is puzzling by nature. However, given the importance of humor in our daily lives and of computers in our work and entertainment, studies related to computational humor will become increasingly significant in fields such as human-computer interaction, intelligent interactive entertainment, and computer-assisted education.

Previous work in computational humor has focused mainly on humor generation [1,2], and little research has addressed developing systems for automatic humor recognition [3] (see the "Related Work on Computational Humor" sidebar). This is not surprising because, computationally, humor recognition appears to be significantly more subtle and difficult than humor generation. Moreover, the absence of very large collections of humorous texts has hindered the development of systems that use humor in text-based applications. Consequently, few such systems are available.

In this article, we explore the applicability of computational approaches to the recognition and use of verbally expressed humor. In particular, we focus on three research questions related to this problem: Can we automatically gather large collections of humorous texts? Can we automatically recognize humor in text? And can we automatically insert humorous add-ons into existing applications?
Humor is essential for interpersonal communication, but research often neglects the topic. Computational approaches can be successfully applied to the recognition and use of verbally expressed humor.

One-liners versus long jokes

Because a deep comprehension of all humor styles is probably too ambitious for existing computational capabilities, we restricted our investigation to one-liners. A one-liner is a short sentence with comic effects and an interesting linguistic structure: simple syntax, deliberate use of rhetorical devices (such as alliteration or rhyme), and frequent use of creative language constructions meant to attract the reader's attention. For instance, "I'm not a vegetarian because I love animals, I'm a vegetarian because I hate plants" is an example of a one-liner.

Although longer jokes can have a relatively complex narrative structure, a one-liner must produce the humorous effect in one shot, with few words. This makes one-liners particularly suitable for automatic learning settings because the humor-producing features are guaranteed to be present in the first (and only) sentence.

Web-based bootstrapping of humorous one-liners

Large amounts of training data can potentially make the learning process more accurate and at the same time provide insights into how increasingly larger data sets can affect classification precision. However, we found that manually constructing a very large one-liner data set was problematic because most Web sites or mailing lists that had such jokes did not list more than 50 to 100 one-liners. To tackle this problem, we implemented a Web-based bootstrapping algorithm that could collect numerous one-liners starting with a short seed list consisting of a few manually identified one-liners.

Figure 1 illustrates the bootstrapping process. Starting with the seed set, the algorithm automatically identifies a list of Web pages that include at