Journal of Management Information Systems / Spring 2003, Vol. 19, No. 4, pp. 191–212. © 2003 M.E. Sharpe, Inc. 0742–1222 / 2003 $9.50 + 0.00.

Generating and Browsing Multiple Taxonomies Over a Document Collection

SCOTT SPANGLER, JEFFREY T. KREULEN, AND JUSTIN LESSLER

SCOTT SPANGLER has been doing knowledge base and data mining research for the past 15 years, recently at the IBM Almaden Research Center and previously at the General Motors Technical Center. Since coming to IBM in 1996, he has developed software components for data visualization and text mining, which are available through the Lotus Discovery Server product and IBM Alphaworks. Mr. Spangler has published papers at ACM-SIGKDD, Machine Learning, IAAI, ACM Hypertext, and HICSS. He holds five patents and has several more patents pending. Scott Spangler holds a B.S. in Math from MIT and an M.A. in Computer Science from the University of Texas.

JEFFREY T. KREULEN is a manager at the IBM Almaden Research Center. He holds a B.S. in applied mathematics (computer science) from Carnegie Mellon University, and an M.S. in electrical engineering and a Ph.D. in computer engineering from the Pennsylvania State University. Since joining IBM in 1992, he has worked on multiprocessor systems design and verification, operating systems, systems management, Web-based service delivery, and integrated text and data analysis.

JUSTIN LESSLER started his career with IBM after graduating from the University of North Carolina at Chapel Hill in 1996 and has been with the IBM Almaden Research Center since 1999.

ABSTRACT: We present a novel system and methodology for generating and then browsing multiple taxonomies over a document collection.
Taxonomies are generated using a broad set of capabilities, including metadata, keyword queries, and automated clustering techniques that serve as a seed taxonomy. The taxonomy editor, eClassifier, provides powerful tools to visualize and edit each taxonomy to make it reflective of the desired theme. Cluster validation tools allow the editor to verify that documents received in the future can be automatically classified into each taxonomy with sufficiently high accuracy. In general, those seeking knowledge from a document collection may have only a vague notion of exactly what they are attempting to understand, and would like to explore related topics and concepts rather than simply being given a set of documents. For this purpose, we have developed MindMap, an interface utilizing multiple taxonomies and the ability to interact with a document collection.

KEY WORDS AND PHRASES: data mining, document classification, document clustering techniques, knowledge management, navigation, taxonomy, text mining, visualization.