Educational Technology & Society 4(3) 2001
ISSN 1436-4522

SourceFinder: Course preparation via linguistically targeted web search

Irvin R. Katz and Malcolm I. Bauer
Educational Testing Service
Princeton, NJ 08541 USA
Tel: +1 609 734 5150
Fax: +1 609 734 1090
ikatz@ets.org
mbauer@ets.org

ABSTRACT
The use of the Internet for course preparation is ill served by traditional, content-based search engines. This paper describes SourceFinder, a web search engine that locates text material based on linguistic characteristics, such as reading level. Combining SourceFinder with content-based searches may allow instructors to identify material relevant for their courses more easily. We present examples from the teaching of reading comprehension, history, and statistics of how SourceFinder might aid instructors’ use of the Internet for course preparation.

Keywords
Teaching, Text analysis, Internet search, Linguistic features, Course preparation

The Problem
Course preparation is a difficult, time-consuming process. It is not unusual to hear of new assistant professors spending a day or more of preparation for each hour of class time. At the elementary and secondary school level, even more experienced teachers report spending approximately 5 hours per week in curriculum or lesson planning for their 15 hours per week of actual teaching (TIMSS, 1995). Course preparation can be challenging even for experienced instructors because of the need to keep their instructional material current with the field they are teaching and relevant to students’ everyday lives.

In the past few years, instructors have turned to the Internet to aid course preparation. Compared with published textbooks or test banks produced by publishers, the Internet offers a wider variety of sample instructional material as well as real-world material that can be adapted for instructional use.
Thus, to construct lecture notes, tests, and other instructional material, teachers might conduct web searches to find raw materials. While this use of the Internet by instructors is becoming more prevalent, it is ill served by current content-based web search engines, which rely on the information retrieval skills of the instructor. Furthermore, content-based web searches typically return a virtual mountain of information, resulting in the “information overload” problem (Nielsen, 1995): how to find the material most relevant to your need without being overwhelmed by the irrelevant (but related) information generated through content-based (keyword) searches. Sites such as the WWW Virtual Library are a helpful alternative to general content-based web searches, but are limited to material already identified and organized by others. Furthermore, while such sites might have material in the correct content area, the material is of uneven quality, and an instructor would need some time even to identify instructional material of an appropriate difficulty level for his or her students.

This paper describes SourceFinder, a domain-independent search engine that locates text material based on linguistic characteristics rather than purely on content. Instead of searching based on content, SourceFinder searches a website (following links to a user-specified depth) to locate passages of text that meet characteristics such as a particular reading level, a certain density of argument, and an internally coherent clarity of expression (i.e., the passage can be understood with minimal background knowledge). Such factors cut across content categories, which is needed for applications including (a) constructivist learning approaches that emphasize real-world contexts for student learning activities and assessments and (b) assessments of linguistic competence, which require challenging prompts that nevertheless do not allow content knowledge to help or hinder performance.
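
The search behavior described above (following links to a user-specified depth and keeping passages in a target reading-level range) can be illustrated with a minimal sketch. It is not SourceFinder's implementation. The paper does not say how reading level is measured, so the sketch substitutes the Flesch-Kincaid grade-level formula with a crude vowel-group syllable count. The function names, the in-memory `site` dictionary standing in for a real website, and the URLs are all hypothetical.

```python
import re
from collections import deque


def count_syllables(word):
    # Crude heuristic (an assumption): one syllable per run of vowels.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def grade_level(text):
    # Flesch-Kincaid grade level, used here only as a stand-in for
    # whatever reading-level measure SourceFinder actually applies.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return None
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


def find_passages(site, start, max_depth, min_grade, max_grade):
    # Breadth-first traversal from `start`, following links no deeper
    # than `max_depth`, keeping pages whose grade level is in range.
    # `site` maps a URL to a (text, outgoing_links) pair.
    seen = {start}
    queue = deque([(start, 0)])
    hits = []
    while queue:
        url, depth = queue.popleft()
        text, links = site.get(url, ("", []))
        grade = grade_level(text)
        if grade is not None and min_grade <= grade <= max_grade:
            hits.append((url, grade))
        if depth < max_depth:
            for link in links:
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return hits


if __name__ == "__main__":
    # Hypothetical three-page site used only for illustration.
    demo_site = {
        "home": ("The cat sat. The dog ran.", ["essay"]),
        "essay": ("Constitutional interpretation necessitates "
                  "extraordinarily meticulous deliberation.", ["notes"]),
        "notes": ("We read the text. Then we talk about it.", []),
    }
    for url, grade in find_passages(demo_site, "home", 1, 10, 100):
        print(url, round(grade, 1))
```

A real crawler would fetch and parse pages over the network and would score individual passages rather than whole pages. The in-memory dictionary keeps the sketch self-contained and makes the depth limit easy to see.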
SourceFinder searches can be combined with more traditional content-based searches by asking SourceFinder to follow the links resulting from a web search engine, such as Yahoo!® or Google. The software framework and general approach of SourceFinder have direct applications to preparation for classroom instruction. For example, a writing instructor might use SourceFinder to locate material to serve as