Abstract. Classification systems for research publications are often based on taxonomies. The ACM, a professional society for computing, provides a digital library whose cataloguing system is based on a taxonomy that has been continuously updated over the years. CiteSeer contains a large collection of computer science research papers, many of which are tagged with categories from the ACM’s taxonomy. By analyzing the small portion of CiteSeer’s documents that were manually tagged, and by considering different time frames, we extracted statistics that show how the ACM’s taxonomy covers publications in the sub-fields of computer and information science research. We also studied the size and growth of categories over the last four available years. These results reveal areas with higher or lower publication rates. We believe that these techniques could be used to quickly identify trends within taxonomies, which would greatly help in constructing more efficient browsing and searching systems.

Keywords: Classification systems, ACM taxonomy, CiteSeer digital library.

I. INTRODUCTION

CLASSIFICATION systems for research publications must be continuously improved and adapted to reflect current research activities and trends. This is especially true for the computer and information sciences, a field that changes rapidly: new areas of research continue to emerge while research in other areas falls off. Identifying a society’s interests is important not only for understanding research trends, but also for analyzing how the adopted classification system is used, verifying that it properly covers all research areas and that all categories are used as evenly as possible. Knowing the distribution of publications across categories is important for developing and maintaining efficient searching and browsing systems. In addition, by identifying “hot” and “cold” publication areas, we can identify areas for improvement in the classification system.
This information is also of general interest, since it provides a summary of current research activities in the field.

ACM¹ is the world’s oldest and largest educational and scientific computing society. Its numerous conferences and journals are the most popular choices for researchers to publish their work. The ACM’s Computing Classification System (CCS), first developed in 1964, is a widely used taxonomy of computer and information science areas. Researchers use the categories in this taxonomy to classify their work, and the IEEE Computer Society² has also adopted it as the basis for its own taxonomy.

This work was partially supported by the National Science Foundation under grant NSF CRI 0454121 (Next Generation CiteSeer).
¹ ACM: Association for Computing Machinery, http://www.acm.org, last visited: November 2009.

In order to study the fit between the ACM’s taxonomy and published research over time, we studied the documents contained in the CiteSeer [16] collection, an automated digital library for scholarly computing-related literature. Our snapshot of this collection contains the 4,348 documents published between 1980 and 2005 that were explicitly tagged by their authors. We processed these documents to study changes and analyze trends in the use of the ACM’s taxonomy. By analyzing the use of the ACM’s CCS over the CiteSeer collection, we can determine how well the taxonomy represents current research trends in the Computer & Information Science & Engineering (CISE) community. This analysis shows how the taxonomy is used and reveals research trends that can assist future revisions of the ACM’s CCS.

We believe that the structure of classification systems will become more and more complex over the years. Making changes to the taxonomy will be increasingly difficult, because any change will have to preserve historical integrity while simultaneously adapting the taxonomy to reflect current research trends.
For instance, given the types of changes applied to the CCS taxonomy during its last revision in 1998 (e.g., the introduction of cross references), which are described in Section 3, the structure will become more similar to an ontology with multiple, non-hierarchical links than to a simple hierarchical taxonomy. For this reason, tools should be implemented to conduct periodic, automatic analyses such as the one reported in this study. Further investigations could treat this taxonomy as an ontology and apply state-of-the-art techniques for maintaining and evolving ontologies.

² IEEE Computer Society, http://www.computer.org/, last visited: November 2009.

Using CiteSeer to Analyze Trends in the ACM’s Computing Classification System
Mirco Speretta¹, Susan Gauch², and Praveen Lakkaraju³
¹,² University of Arkansas, Fayetteville, AR, USA; ³ University of Kansas, Lawrence, KS, USA
msperett@uark.edu, sgauch@uark.edu, lakkaraju.praveen@gmail.com
978-1-4244-7562-9/10/$26.00 ©2010 IEEE

In Section 2, we begin by discussing some trend analysis techniques used in previous studies. Section 3 then describes the ACM CCS taxonomy and the CiteSeer data collection in more detail. In Section 4 we introduce