FudanNLP: A Toolkit for Chinese Natural Language Processing with Online Learning Algorithms

Feng Ji (计峰), Wenjun Gao (高文君), Xipeng Qiu (邱锡鹏), Xuanjing Huang (黄萱菁)
School of Computer Science, Fudan University
{fengji, 082024008, xpqiu, xjhuang}@fudan.edu.cn

Abstract

In this paper, we describe FudanNLP, a Chinese natural language processing (NLP) toolkit that we employed in CIPS-PARSEVAL 2009. In all five closed tests, our system ranked above average and was only slightly weaker than the best system.

1 Introduction

In the past decade, many researchers have applied machine learning algorithms to NLP and made great progress. They regard NLP tasks as structured learning problems and have proposed many algorithms to solve them [2, 10, 9, 6, 1]. However, transforming NLP tasks into optimization problems often yields a high-dimensional feature space and a large amount of data. Most of the above methods therefore require a large amount of memory and a long time in the training phase. We chose to implement an online learning algorithm to avoid these limitations.

Our system is based on an online learning algorithm, Passive-Aggressive [5], which has an efficient update rule derived from an optimization problem and needs relatively less training time than batch algorithms such as maximum entropy [1] and conditional random fields [6].

We first classify the tasks in CIPS-PARSEVAL 2009 into different learning problems. There are five tasks in CIPS-PARSEVAL 2009: part-of-speech (POS) tagging, base chunk analysis, functional chunk analysis, event chunk detection, and constituent parsing.
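
To make the efficiency argument for Passive-Aggressive learning more concrete, the following is a minimal sketch of the basic binary-classification form of the update as it is commonly described: the weights stay unchanged when an example is classified with margin at least 1, and otherwise move just far enough to satisfy that margin. This is an illustration only, not the structured variant used in FudanNLP; the names `dot` and `pa_update` and the sparse dictionary representation are assumptions made for this example.

```python
from typing import Dict

# Sparse vectors are represented as feature-name -> value dictionaries
# (an assumption of this sketch, not the toolkit's data structure).
Vector = Dict[str, float]


def dot(w: Vector, x: Vector) -> float:
    """Inner product of a weight vector and a sparse feature vector."""
    return sum(w.get(name, 0.0) * value for name, value in x.items())


def pa_update(w: Vector, x: Vector, y: int) -> float:
    """Apply one Passive-Aggressive step in place; y must be +1 or -1.

    Returns the step size tau that was used (0.0 if no update was needed).
    """
    # Hinge loss: zero when the example is classified with margin >= 1.
    loss = max(0.0, 1.0 - y * dot(w, x))
    squared_norm = sum(value * value for value in x.values())
    if loss == 0.0 or squared_norm == 0.0:
        return 0.0  # passive: the current weights already satisfy the margin

    # Aggressive: smallest change to w that brings the margin up to 1.
    tau = loss / squared_norm
    for name, value in x.items():
        w[name] = w.get(name, 0.0) + tau * y * value
    return tau


if __name__ == "__main__":
    weights: Vector = {}
    example: Vector = {"a": 1.0, "b": 2.0}
    step = pa_update(weights, example, 1)
    print(step, weights)
```

Each step touches only the features present in the current example and needs no pass over the rest of the training set, which is the property that keeps memory use and training time low compared with batch optimization.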