Correlation Between Turkish Stock Market and Economy News

Sadi Evren SEKER, Computer Engineering Dept., Istanbul University, academic@sadievrenseker.com
Zeki ERDEM, Turkish National Science Foundation, zeki.erdem@tubitak.gov.tr
Nuri OZALP, Turkish National Science Foundation, nuri.ozalp@tubitak.gov.tr
Cihan MERT, Electrical Engineering, University of Texas at Dallas, Cihan.Mert@utdallas.edu
Khaled Al-NAAMI, Computer Science Department, University of Texas at Dallas, kma041000@utdallas.edu
Latifur KHAN, Computer Science Department, University of Texas at Dallas, lkhan@utdallas.edu

Abstract
Are stock market speculations related to the news in the newspapers? This study focuses on the correlation between the economy news from one of the highest-circulation newspapers in Turkey and the Istanbul stock market closing values. The data set is collected in natural language from the newspaper's web page, and a text mining technique, term frequency-inverse document frequency (TF-IDF), is applied to these news items. The stock market values, on the other hand, are treated as a signal processing problem, and the random walk method is applied to them. The two feature vectors are then correlated using several classification algorithms, such as support vector machines, k-nearest neighbors, and artificial neural networks. The results show a weak relation, of over 43%, between the news and the stock market closing values. We believe this research will be beneficial to the literature as a basis for stock market estimation tools built on economy news, or for market strength analysis.

Categories and Subject Descriptors
H.2.8. Databases / Data Mining, J.1. Administrative Data Processing / Financial

General Terms
Data Mining, Financial Data Processing, Stock Market Analysis

Keywords
Data Mining, Big Data, Stock Market Analysis, SVM, ANN, KNN, Random Walk, Text Mining

1. INTRODUCTION
This study is built on one of the most widely circulated newspapers in Turkey, which has dedicated pages for economy news.
We have collected only these economy news items, which are free of other content such as sports or magazine news. The properties of the data set are explained in the experiments section. We processed the news text with a text mining approach called term frequency-inverse document frequency (TF-IDF), which is explained in the methodology section. The stock market closing values, on the other hand, were processed with a signal processing approach, the random walk (RW). Finally, we investigated the correlation between these two feature vectors using support vector machines (SVM), k-nearest neighbors (KNN), and artificial neural networks (ANN), which are discussed in the classification section. The paper also covers the implementation details and the methodology used to evaluate the classification results, which are presented in the evaluation section.

2. PROBLEM STATEMENT
This study can be categorized as a correlation study built on text mining and signal processing. Two major feature extraction methods are implemented on two data sets: TF-IDF on the economy news and RW on the stock market closing values.

Figure 1. Overview of Study

The correlation between news and the stock market is one of the indicators of speculative markets [1]. One of the difficulties in this study is dealing with a natural language data source, which requires feature extraction. The other difficulty is dealing with a stock market value which is