Feature Extraction and Loss Training using CRFs: A Project Report*

Ankan Saha
Department of Computer Science
University of Chicago

March 11, 2008

Abstract

POS tagging has long been an important problem in Natural Language Processing. It has been approached with different tools such as Maximum Entropy Models [3], Cyclic Dependency Networks [2], and Conditional Random Fields. My work mainly revolved around Conditional Random Fields: developing them as a loss function for an optimization solver, training that loss function, and using the Viterbi algorithm to build an estimator for the process.

1 Introduction

Most Natural Language tasks involve the use of Part-of-Speech (POS) tagging. Conditional Random Fields (CRFs) are a powerful machine learning tool [1], [4] which has been used for POS tagging in NLP with better results than Maximum Entropy Models and other machine learning models. Conditional Random Fields are a framework for building probabilistic models for segmenting and labeling sequence data. They are undirected discriminative models which have an advantage over HMMs and other generative models because they model the conditional probability p(y|x) instead of the joint probability p(y, x). Thus CRFs let us include richer and more informative features while remaining oblivious to the nature of p(x), which must otherwise be known for generative models. Thus CRFs do not need to make any

* Done as part of a Summer Internship Project at NICTA Australia, May-July 2007.
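The Viterbi decoding mentioned above can be sketched for a generic linear-chain model as follows. This is a minimal illustration, not the report's implementation: the tag set, the score tables (start, transition, and emission log-scores), and the toy sentence are all hypothetical assumptions introduced here.

```python
def viterbi(obs, tags, emit, trans, start):
    """Return the highest-scoring tag sequence for the observation list.

    emit[t][w]  : emission score of word w under tag t (log-score)
    trans[s][t] : transition score from tag s to tag t (log-score)
    start[t]    : score of beginning the sequence in tag t (log-score)
    Scores are log-domain, so paths are scored by addition.
    """
    # delta[t] holds the best score of any path ending in tag t at the
    # current position; back[i][t] remembers that path's predecessor tag.
    delta = {t: start[t] + emit[t][obs[0]] for t in tags}
    back = []
    for word in obs[1:]:
        prev = delta
        delta, pointers = {}, {}
        for t in tags:
            best_prev = max(tags, key=lambda s: prev[s] + trans[s][t])
            pointers[t] = best_prev
            delta[t] = prev[best_prev] + trans[best_prev][t] + emit[t][word]
        back.append(pointers)
    # Trace the best path backwards from the best final tag.
    last = max(tags, key=lambda t: delta[t])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return list(reversed(path))

# Toy example (illustrative scores, not trained CRF weights):
tags = ["N", "V"]
start = {"N": 0.0, "V": -1.0}
trans = {"N": {"N": -1.0, "V": 0.0}, "V": {"N": 0.0, "V": -1.0}}
emit = {"N": {"dogs": 0.0, "bark": -2.0},
        "V": {"dogs": -2.0, "bark": 0.0}}
```

With these scores, `viterbi(["dogs", "bark"], tags, emit, trans, start)` returns `["N", "V"]`. In a trained CRF, the score tables above would instead be computed from the learned feature weights at each position.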