Indexing and Access for Digital Libraries and the Internet: Human, Database, and Domain Factors

Marcia J. Bates
Department of Information Studies, University of California, Los Angeles, Los Angeles, California 90095-1520
E-mail: mjbates@ucla.edu

Discussion in the research community and among the general public regarding content indexing (especially subject indexing) and access to digital resources, especially on the Internet, has underutilized research on a variety of factors that are important in the design of such access mechanisms. Some of these factors and issues are reviewed and implications drawn for information system design in the era of electronic access. Specifically, the following are discussed: Human factors: subject searching vs. indexing, multiple terms of access, folk classification, basic-level terms, and folk access; Database factors: Bradford's Law, vocabulary scalability, the Resnikoff-Dolby 30:1 Rule; Domain factors: the role of domain in indexing.

Introduction

Objectives

In the current era of digital resources and the Internet, system design for information retrieval has burst onto the stage of the public consciousness in a way never seen before. Almost overnight, people who had never thought about information retrieval are now musing on how to get the best results from their query on a World Wide Web search engine, or from a remote library catalog or digital library resource. At the same time, and under the same stimulus, experts in a variety of fields cognate to information science—such as cognitive science, computational linguistics, and artificial intelligence—are taking up information retrieval questions from their perspectives.

In the meantime, those of us in information science, where information retrieval from recorded sources (as distinct from mental information retrieval) has long been a core, if not the core, concern, are faced with a unique mix of challenges.
Information science has long been a field understaffed with researchers. Where some fields have 10 researchers working on any given question, we have often had one researcher for 10 questions. A promising research result from an information science study may languish for years before a second person takes up the question and builds on the earlier work. (This is not universal in the field; some questions are well studied.) As a consequence of this understaffing, we know a lot about many elements of information retrieval, but often in a fragmented and underdeveloped way.

At the same time, what we do know is of great value. Years of experience, not only with research but also with application in dozens of information-industry companies, have given information scientists a deep understanding of information retrieval that is missing in the larger world of science and information technology.

So at this particular historical juncture in the development of information retrieval research and practice, I believe there are a number of both research results and experience-based bits of knowledge that information scientists need to be reminded of, and non-information scientists need to be informed of—information scientists because the fragmentation and understaffing in our field have made it difficult to see and build on all the relevant elements at any one time, and people outside of information science because these results are unknown to them, or at least unknown to them in their information retrieval implications.

My purpose here is to draw attention to that learning and those research results associated with indexing and access to information, which have been relatively underutilized by those both inside and outside of information science. These are results that seem to me to have great potential and/or importance in information system design, and for further research. Making such a selection is, of course, a matter of judgment.
I believe that the material below offers the possibility of enriching design and, when studied further, enriching our

Received January 31, 1997; revised August 28, 1997; accepted November 7, 1997.
© 1998 John Wiley & Sons, Inc.
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, 49(13):1185–1205, 1998. CCC 0002-8231/98/131185-21