Computational Linguistics in the Netherlands Journal 7 (2017) 3-16
Submitted 05/2017; Published 12/2017

Evaluation of Named Entity Recognition in Dutch online criminal complaints

Marijn Schraagen, M.P.Schraagen@uu.nl
Matthieu Brinkhuis, M.J.S.Brinkhuis@uu.nl
Floris Bex, F.J.Bex@uu.nl
Utrecht University, The Netherlands

© 2017 Marijn Schraagen, Matthieu Brinkhuis and Floris Bex.

Abstract

The possibility for citizens to submit crime reports and criminal complaints online is becoming ever more common, especially for cyber- and internet-related crimes such as phishing and online trade fraud. Such user-submitted crime reports contain references to entities of interest, such as the complainant, counterparty, items being traded, and locations. Using named entity recognition (NER) algorithms, these entities can be identified and used in further information extraction and legal reasoning. This paper describes an evaluation of the de facto standard NER algorithm for Dutch on crime reports provided by the Dutch police. An analysis of confusion in entity type assignment and recall errors is presented, as well as suggestions for performance improvement. Besides traditional evaluation based on a manually created gold standard, an alternative assessment method is applied to allow for more efficient evaluation and error analysis. The paper concludes with a general discussion on the use of NER in information extraction.

1. Introduction

Named-entity recognition (NER) is the task of automatically recognising and classifying names that refer to some entity in a text. NER started out as a subtask in the MUC-6 Message Understanding Conference (Grishman and Sundheim 1996), and has since become a standard task in the areas of natural language processing and information retrieval. NER looks for ‘unique identifiers of referents in reality’, such as persons (Dwight Eisenhower), locations (Amsterdam), companies (Google) or products (iPhone). Very often, NER is partly domain dependent; for example, in the biomedical domain it is desired that the names of genes are correctly classified, and in the context of cyber crime we want to identify email addresses and usernames.

In our project ‘Intelligence Application for Cybercrime’ (Bex et al. 2016), we are developing an intake system for the Dutch police that automatically processes criminal complaints regarding cases of online fraud, such as fake webshops and malicious second-hand traders. Every year about 40,000 such complaints are filed online, and the high volume and relatively low damages of such cases make them ideal for further automated processing. The system consists of a dialogue interface that asks the complainant questions about the fraud case (e.g. ‘What happened?’ or ‘Which product did you try to buy?’). Because the complainant can answer using free text input, we need to be able to extract the entities (e.g. fraudsters, email addresses, products) so that the correct questions can be asked.

Early approaches to NER were very much rule-based, often combined with gazetteers in which specific entities are listed (Nadeau and Sekine 2007). The problem with this approach is that it involves a lot of manual work, and that rules and lists of entities do not transfer to other domains. Newer approaches typically use supervised machine learning, and for English news texts the task is as good as solved, with F-scores for algorithms close to human scores (around 94%; Zhou and Su 2002). However, these approaches also do not transfer well to other domains or to texts which are stylistically and grammatically of lesser quality than news texts, such as email or