Narratives: A Visualization to Track Narrative Events as they Develop

Danyel Fisher*, Aaron Hoff, George Robertson
Microsoft Research

Matthew Hurst
Microsoft Live Labs

ABSTRACT

Analyzing unstructured text streams can be challenging. One popular approach is to isolate specific themes in the text and to visualize the connections between them. Some existing systems, like ThemeRiver, provide a temporal view of changes in themes; other systems, like In-Spire, use clustering techniques to help an analyst identify the themes at a single point in time. Narratives combines these two techniques: it uses a temporal axis to visualize how concepts have changed over time, and introduces several methods for exploring how those concepts relate to each other.

Narratives is designed to help the user place news stories in their historical and social context by understanding how the major topics associated with them have changed over time. Users can relate articles through time by examining the topical keywords that summarize a specific news event. By tracking the attention paid to a news article in the form of references in social media (such as weblogs), a user both discovers important events and measures the social relevance of these stories.

KEYWORDS: blogs, events, trends, time series, topic detection and tracking.

INDEX TERMS: I.7.m [Document and Text Processing]: Miscellaneous. I.3.8 [Computer Graphics]: Applications.

1 INTRODUCTION

A standing challenge in Visual Analytics research is the analysis of unstructured text streams, such as news stories and blog entries. There has been a wide variety of approaches to this problem, each emphasizing different aspects of the data. Natural-language processing approaches try to bring out the actors and events of the stories. Other approaches extract keywords, cluster concepts, or arrange stories and themes along timelines.
One particularly interesting area of analysis is the news, in part because it has implications both for analysts and for news readers. News stories are a relevant source of current information when taken one at a time; en masse, they become a reflection of culturally important information. Yet examining individual news stories, or even groups of stories, misses important aspects of news context. Reading an article in the paper gives little information about two critical, related areas: how the topic of that article has changed over time, and how readers are reacting to the article.

As readers of the news, we are interested in the evolution of stories. What we might call the “narrative” around certain themes is shaped by the appearance of articles, and evolves over time: a company releases a new product, and is featured in the news; a presidential candidate enters a race, competes, and weathers scandals. All of these separate stories come together in a unified narrative of the candidate’s trajectory.

The reactions of readers to the news also help us understand the context of the information we are reading. One of the most accessible sorts of responses to news can be found in blogs, which have recently gained prominence within the VAST research community (and many others): the 2007 Contest, for instance, leveraged blogs as a critical portion of the solution.

In this paper, we present Narratives (Figure 1). Narratives presents a way to view temporally-changing data. It works from a corpus of blog entries that talk about news stories, and so reflects both the articles about a topic and the blogs that comment on those articles. Despite its fairly simple visualization technique, based around a line graph, Narratives allows users to see what additional concepts are most associated with a selected term by displaying closely related terms in several ways.
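As a rough illustration of the two quantities described above (how often a term is mentioned over time, and which other terms are most associated with it), the following minimal sketch works under stated assumptions. It assumes each blog entry has already been reduced to a date and a set of keywords, and it ranks related terms by simple co-occurrence counts; the sample data, the function names `daily_mentions` and `related_terms`, and the co-occurrence measure are all hypothetical and are not taken from the Narratives implementation.

```python
from collections import Counter
from datetime import date

# Hypothetical toy corpus: each blog entry is (publication date, keyword set).
ENTRIES = [
    (date(2008, 1, 3), {"candidate_a", "primary", "debate"}),
    (date(2008, 1, 3), {"candidate_b", "primary"}),
    (date(2008, 1, 4), {"candidate_a", "primary"}),
    (date(2008, 1, 4), {"candidate_a", "candidate_b"}),
    (date(2008, 1, 5), {"candidate_b", "debate"}),
]


def daily_mentions(entries, term):
    """Count, for each day, the entries whose keyword set contains `term`.

    The result maps date -> count, ordered by date, which is the series
    one would plot as a single line on a shared-axis line graph.
    """
    counts = Counter()
    for day, keywords in entries:
        if term in keywords:
            counts[day] += 1
    return dict(sorted(counts.items()))


def related_terms(entries, term, top_n=3):
    """Rank other keywords by how often they co-occur with `term`.

    Co-occurrence counting is an assumed stand-in for whatever
    association measure a real system would use.
    """
    co_occurrence = Counter()
    for _, keywords in entries:
        if term in keywords:
            co_occurrence.update(keywords - {term})
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(co_occurrence.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    print(daily_mentions(ENTRIES, "candidate_a"))
    print(related_terms(ENTRIES, "candidate_a"))
```

Keeping one count series per term makes it straightforward to draw several terms on shared axes for comparison, while the co-occurrence ranking gives a candidate list of related terms to display alongside the selected one.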
In Figure 1, the Narratives display compares the fortunes of four presidential primary competitors over the first three months of 2008. The number of references to each candidate’s name is shown as a line on the graph; the lines share axes, and so can be compared.

Figure 1. Narratives, showing daily references to four US presidential candidates from January 1 – March 26, 2008. Time runs along the x axis for each candidate; the number of mentions of the term runs along the y axis. Note that Huckabee (orange) falls off as his campaign ends.

The contribution of this paper is to show a way to piece together this complex of information. By viewing each response to a news story as a single event with multiple keywords, we can visualize the sequences of keywords as a series of simple (but related) line graphs. Unlike much past research, which has largely emphasized a single variable changing over time, our particular challenge is to examine multiple, possibly related variables. We wish both to examine the continuity of themes over time and to find correlations between themes.

* {danyelf, aaronho, ggr, mhurst}@microsoft.com

2 RELATED APPROACHES

There have been a variety of approaches to looking at how ideas evolve over time. The information retrieval area of topic detection and tracking, for instance, looks at how discussions of topics change. We examine the topic detection and tracking