FudanNLP: A Toolkit for Chinese Natural Language Processing with Online Learning Algorithms

Feng Ji (计峰), Wenjun Gao (高文君), Xipeng Qiu (邱锡鹏), Xuanjing Huang (黄萱菁)
School of Computer Science, Fudan University
{fengji, 082024008, xpqiu, xjhuang}@fudan.edu.cn

Abstract

In this paper, we describe FudanNLP, a Chinese natural language processing (NLP) toolkit that we employed in CIPS-PARSEVAL 2009. In all five closed tests, our system ranked above average and was only slightly weaker than the best system.

1 Introduction

In the past decade, many researchers have applied machine learning algorithms to NLP and made great progress. They regard NLP tasks as structured learning problems and have proposed many algorithms to solve them [2, 10, 9, 6, 1]. However, transforming NLP tasks into optimization problems often yields high-dimensional feature spaces and large data sets. Most of the above methods require large amounts of memory and long running times in the training phase. We therefore implement an online learning algorithm to avoid these limitations.

Our system is based on an online learning algorithm, Passive-Aggressive [5], which has an efficient update rule derived from an optimization problem and needs relatively less training time than batch algorithms such as maximum entropy [1] and conditional random fields [6].

We first classify the tasks in CIPS-PARSEVAL 2009 into different learning problems. There are five tasks in CIPS-PARSEVAL 2009: part-of-speech (POS) tagging, base chunk analysis, functional chunk analysis, event chunk detection, and constituent parsing.
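To make the idea of a Passive-Aggressive update concrete, the sketch below shows the standard binary PA-I formulation from the online learning literature. It is only an illustration under simplifying assumptions: labels are in {-1, +1}, features are dense lists, and the names `pa_update`, `train`, and the aggressiveness parameter `C` are hypothetical. It is not the FudanNLP implementation, which addresses structured tasks such as tagging and chunking.

```python
from typing import List, Sequence, Tuple


def pa_update(w: List[float], x: Sequence[float], y: int, C: float = 1.0) -> List[float]:
    """One binary PA-I step (illustrative; labels y are assumed to be -1 or +1)."""
    # Signed margin of the current weights on this example.
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    # Hinge loss: zero when the example is classified with margin >= 1.
    loss = max(0.0, 1.0 - margin)
    if loss == 0.0:
        return list(w)  # "passive": no change needed
    sq_norm = sum(xi * xi for xi in x)
    if sq_norm == 0.0:
        return list(w)  # an all-zero feature vector cannot change the margin
    # "aggressive": smallest step that fixes the loss, capped by C (PA-I).
    tau = min(C, loss / sq_norm)
    return [wi + tau * y * xi for wi, xi in zip(w, x)]


def train(data: Sequence[Tuple[Sequence[float], int]],
          n_features: int, epochs: int = 1, C: float = 1.0) -> List[float]:
    """Online training: each example is seen once per epoch and updates w immediately."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y in data:
            w = pa_update(w, x, y, C)
    return w
```

Because each update touches only the current example, memory use does not grow with the size of the training set, which is the property the introduction contrasts with batch methods.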