Quantitative Analysis of Art Market Using Ontologies, Named Entity Recognition and Machine Learning: A Case Study

Dominik Filipiak 1, Henning Agt-Rickauer 2, Christian Hentschel 2, Agata Filipowska 1, Harald Sack 2

1 Department of Information Systems, Poznań University of Economics, Al. Niepodległości 10, 61-875, Poznań, Poland
{dominik.filipiak,agata.filipowska}@kie.ue.poznan.pl, WWW home page: http://kie.ue.poznan.pl

2 Hasso-Plattner-Institut, Prof.-Dr.-Helmert-Straße 2-3, 14482 Potsdam, Germany
{henning.agt-rickauer,christian.hentschel,harald.sack}@hpi.de, WWW home page: http://hpi.de

Abstract. In this paper, we investigate new approaches to quantitative art market research, such as statistical analysis and the construction of market indices. An ontology has been designed to describe art market data in a unified way. To ensure the quality of information in the knowledge base of the ontology, data enrichment techniques such as named entity recognition (NER) and data linking are also applied. Using techniques from computer vision and machine learning, we predict the style of a painting. The paper includes a case study that validates our approach in detail.

Key words: art market, Semantic Web, linked data, machine learning, information retrieval, alternative investment, digital humanities

1 Introduction

Due to the constantly growing interest in alternative investments, the art market has become the subject of numerous studies. By publishing sales data, many services and auction houses provide a basis for further research using recent data analysis methods. A closer look at the available data, however, reveals missing or inconsistent information in many cases. Researchers studying auction markets have invested considerable effort (see the next section), especially in index construction. To the best of our knowledge, the problem of data quality has rarely been raised in that field.
To tackle this issue, we propose combining standard econometric analysis with recent techniques from computer science.