Performance Analysis of a Part of Speech Tagging Task

Rada Mihalcea
University of North Texas
Computer Science Department
Denton, TX, 76203-1366
rada@cs.unt.edu

Abstract. In this paper, we attempt a formal analysis of the performance of automatic part of speech tagging. We provide lower and upper bounds on tagging precision using existing taggers or their combination. Since we show that perfect automatic tagging is not possible with existing taggers, we offer two solutions for applications requiring very high precision: (1) a solution involving minimum human intervention for a precision of over 98.7%, and (2) a combination of taggers using a memory based learning algorithm that succeeds in reducing the error rate by 11.6% with respect to the best tagger involved.

1 Introduction

Part of speech (POS) tagging is one of the few problems in Natural Language Processing (NLP) that may be considered almost solved, in that several solutions have been proposed so far and have been successfully applied in practice. State-of-the-art systems performing POS tagging achieve accuracies of over 93-94%, which may be satisfactory for many NLP applications. However, certain applications require even higher precision, for example the construction of annotated corpora, where the tagging needs to be performed accurately.

Two solutions are possible for this type of sensitive application: (1) manual tagging, which ensures high accuracy but is highly expensive; and (2) automatic tagging, which may be performed at virtually no cost but requires means for controlling the quality of the labeling process performed by the machine.

POS tagging is required by almost any text processing task, e.g. word sense disambiguation, parsing, logical forms and others. Since it is one of the first processing steps in any such application, the accuracy of the POS tagger directly impacts the accuracy of any subsequent text processing steps.
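To give a rough sense of how per-token tagging errors can compound in later processing steps, the following minimal sketch estimates the chance that an entire sentence is tagged without error. It is an illustration only, not part of the analysis in this paper: it assumes that tagging errors are independent across tokens (a simplification), and the function name `sentence_accuracy` and the sentence length of 20 tokens are hypothetical choices. The two accuracy values are taken from the figures quoted above (94% and 98.7%).

```python
def sentence_accuracy(token_accuracy: float, sentence_length: int) -> float:
    """Probability that every token in a sentence is tagged correctly.

    Assumption (hypothetical, for illustration): tagging errors are
    independent across tokens, so the per-token accuracies multiply.
    """
    if not 0.0 <= token_accuracy <= 1.0:
        raise ValueError("token_accuracy must be between 0 and 1")
    if sentence_length < 0:
        raise ValueError("sentence_length must be non-negative")
    return token_accuracy ** sentence_length


if __name__ == "__main__":
    # 20 tokens is an assumed sentence length, chosen only for illustration.
    for accuracy in (0.94, 0.987):
        whole = sentence_accuracy(accuracy, 20)
        print(f"token accuracy {accuracy:.3f} -> fully correct sentence: {whole:.3f}")
```

Under these assumptions, a small difference in per-token accuracy translates into a much larger difference at the sentence level, which is one way to read the claim that tagger accuracy directly affects subsequent steps.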
We investigate in this paper the current state-of-the-art in POS tagging, derive theoretical lower and upper bounds for the accuracy of individual systems or combinations of these systems, and show that with existing taggers perfect POS tagging is not possible (where perfect tagging is considered to be 100% accuracy with respect to manually annotated data). Subsequently, we provide two possible solutions for this problem. First, we show that it is possible to design

A. Gelbukh (Ed.): CICLing 2003, LNCS 2588, pp. 158–167, 2003. © Springer-Verlag Berlin Heidelberg 2003