Knowledge Organization for Exploration

Amihai Motro and Sylvie Goullioud
Department of Information and Software Systems Engineering
George Mason University
Fairfax, VA 22030-4444

This work was supported in part by NSF Grant No. IRI-9007106 and by an ARPA grant, administered by ONR under Grant No. N0014-92-J-4038.

Abstract. To support applications such as efficient browsing in large knowledge bases and cooperative knowledge discovery in large databases, the concept of rule similarity is essential. In this paper we define such a measure, called distance, and then put it in the service of various knowledge exploration processes. Rule distances allow us to place all the available knowledge on a “map”, in which proximity reflects similarity. Users can then browse the knowledge by iteratively visiting rules and examining their neighborhoods. In our cooperative knowledge discovery process, users suggest a hypothetical rule that has been observed. If the hypothesis is verified, the system may suggest other, even better, rules that hold; if refuted, the system attempts to focus the hypothesis until it holds; in either case, the system may offer other rules that hold and are close to the hypothesis. Our work is within the formal framework of logic databases, and we report on an experimental system that implements our approach.

1 Introduction

Much database research today is concerned with systems that incorporate knowledge. The prevailing approach is to model each knowledge item with a rule. The repository of rules is often unorganized. If organization is imposed, it is usually a tool for indexing the rules, whose purpose is to locate rules quickly in processes such as inference or integrity verification. Such organization, however, is unsuitable when different knowledge items need to be related according to their semantic similarity. One possible approach, which we advocate in this paper, is to define a distance between every two knowledge items. Such a distance measures how “similar” or “different” these items are. It is then possible to consider questions such as “what are the knowledge items similar to a given item?”

An application made possible by the availability of distance is knowledge browsing. User interfaces may be developed that permit users to examine an item of knowledge and then proceed to examine items in its neighborhood. Selecting an item from this neighborhood, users may proceed to examine items in this item’s neighborhood, and so on. Another application is a cooperative system for discovering new knowledge in databases. In this system, a user