Preserving User Preferences in Document-Category Management: An Ontology-based Evolution Approach

Yen-Hsien Lee
Department of MIS, National Chiayi Univ., Chiayi, Taiwan, R.O.C.
yhlee@mail.ncyu.edu.tw

Chih-Ping Wei
Inst. of Tech. Management, National Tsing Hua Univ., Hsinchu, Taiwan, R.O.C.
cpwei@mx.nthu.edu.tw

Paul Jen-Hwa Hu
Acct. and Info. Systems, University of Utah, USA
actph@business.utah.edu

Abstract

Preserving a user’s preferences in document-category management is essential because they affect the user’s search efficiency, cognitive processing load, and satisfaction. Prior research has investigated automated document-category evolution using lexicon-based techniques that take into account the document categories previously created by the user. However, comparing documents at the lexical level cannot effectively resolve word mismatch or ambiguity problems. To address these problems, which are inherent to the lexicon-based approach, we propose an ONtology-based Category Evolution (ONCE) technique, which uses an appropriate ontology to support document-category evolution at the conceptual level rather than at the lexical level. Specifically, we develop an Ontology Enrichment (OE) technique for automatic learning of concept descriptors in the adopted ontology. We empirically evaluate the effectiveness of the proposed ONCE technique, using a lexicon-based document-category evolution technique (i.e., CE2) and the hierarchical agglomerative clustering (HAC) technique as benchmarks. According to our empirical results, ONCE appears more effective than CE2 and HAC, achieving higher clustering recall and precision.

Keywords: Document-category management, Ontology-based category evolution, Category evolution, Concept descriptor learning, Ontology enrichment

Introduction

The advance and proliferation of information technology have fostered the rapid creation and dissemination of information, typically in the form of textual documents, on a massive scale. Analysis of current practices suggests that individuals and organizations commonly use document categories to support users’ information search in ever-increasing corpora. As new documents arrive over time and are assigned to previously created categories, the appropriateness or cohesiveness of the existing categories may deteriorate, because the new documents can significantly change the category contents and thereby adversely affect category coherence and distinction. Understandably, this necessitates category reorganization and may require the creation of new categories. New documents often arrive with great frequency and in enormous quantity, making document-category management increasingly challenging. When not properly managed, document categories evolve in an ad hoc manner, and document assignments can become inconsistent.