IOSR Journal of Computer Engineering (IOSR-JCE) e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 19, Issue 2, Ver. IV (Mar.-Apr. 2017), PP 78-82 www.iosrjournals.org
DOI: 10.9790/0661-1902047882

Analysis of Development Factors for Asian Countries using DWM on Big Data

Pallavi Varandani, Sharvari Jalit, Mrunmayee Mujumdar, Pradeep Lalwani, Prof. Arthi C I, Prof. Priya R L
Department of Computer Science & Engineering, Vivekananda Education Society’s Institute of Technology, Mumbai, India.

Abstract: Many developing countries today are working toward developed status. Earlier work has analyzed countries that experienced banking crises; however, that analysis covered only the data preparation process and a data mining server application for subgroup discovery induction. This paper proposes a data analytical system that analyzes World Bank indicators for Asian countries from 1960 to 2015. Historical data from the World Bank and other sources can be used to predict the time a country needs to reach developed status. The purpose of the paper is to help the government of a nation collect information and plan its path to development more accurately. The analysis can be carried out with methods such as MapReduce, Canopy Clustering and K-means. Clustering and reduction techniques are applied in parallel to enhance the existing technology. The outcome is a prediction of development factors through the analysis of parameters such as population, gross domestic product, trade and employment, which affect the growth of developing countries.

Keywords: Asian Development, World Bank Indicators, K-means, clustering, canopy, big data.

I. Introduction
The development of a country is essential to providing its citizens with a healthy standard of living. Many factors affect the growth of a country.
As a result, many countries have remained in the developing phase for decades. Some countries look similar yet provide very different living standards to their citizens. For example, in 2009 a citizen of Burkina Faso earned on average 510 USD [1], whereas a Japanese citizen earned 37,870 USD. Moreover, in Burkina Faso only 29 percent of the adult population was literate and a newborn could expect to live 53 years, while in Japan all adults were literate and a newborn could expect to live 83 years. This comparison shows that the economy is not the only factor that plays a major role in the development of a nation. Other factors such as life expectancy, birth rate, mortality rate, population and trade also contribute to the development of a country. To understand development better, countries can be clustered by region and by development factors. Here, only Asian countries are considered when analyzing the development features.

The main goal of this paper is to propose a data analytical system for developing countries that estimates the time required for their development and identifies the main factors hindering it. The data required for the analysis are obtained from the World Bank. The World Development Indicators is an openly available database provided by the World Bank [14]. The analysis techniques include MapReduce and clustering algorithms. The MapReduce step first reduces the data by grouping the countries by region. From these groups, the clustering algorithm is applied to the Asian countries only. Clustering is performed in two stages: pre-clustering and clustering. This proposal will help the government of a nation understand the obstacles to the development of its country.

In the following, Section II surveys the related literature, and the subsequent sections describe the proposed system design and the future scope of the system.
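The pipeline outlined in the introduction (grouping countries by region, then clustering the Asian group in two stages) can be illustrated with a minimal sketch. It is not the implementation used in this paper. It assumes that the pre-clustering stage is canopy clustering and the clustering stage is K-means, as suggested by the methods named in the abstract. The records, the thresholds `t1` and `t2`, and all function names are hypothetical, and the indicator values are invented for illustration rather than taken from the World Bank data.

```python
from collections import defaultdict
from math import dist

# Hypothetical records: (country, region, [indicator values]).
# The values are made up for illustration; they are not World Bank data.
RECORDS = [
    ("A", "Asia", [1.0, 1.0]),
    ("B", "Asia", [1.2, 0.8]),
    ("C", "Asia", [5.0, 5.0]),
    ("D", "Asia", [5.2, 4.9]),
    ("E", "Europe", [9.0, 9.0]),
]

def map_phase(records):
    """Map step: emit one (region, (country, features)) pair per record."""
    for country, region, features in records:
        yield region, (country, features)

def reduce_phase(pairs):
    """Reduce step: collect all countries that share the same region key."""
    groups = defaultdict(list)
    for region, value in pairs:
        groups[region].append(value)
    return dict(groups)

def canopies(points, t1, t2):
    """Canopy pre-clustering with a loose threshold t1 and a tight threshold t2."""
    if t1 <= t2:
        raise ValueError("t1 must be larger than t2")
    remaining = list(points)
    result = []
    while remaining:
        center = remaining[0]
        # Every remaining point within t1 of the center joins this canopy.
        members = [p for p in remaining if dist(center, p) <= t1]
        result.append((center, members))
        # Points within t2 (including the center) cannot seed another canopy.
        remaining = [p for p in remaining if dist(center, p) > t2]
    return result

def kmeans(points, centers, iterations=10):
    """Plain K-means started from the given centers; returns (centers, labels)."""
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: label each point with the index of its nearest center.
        labels = [
            min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        for i in range(len(centers)):
            cluster = [p for p, label in zip(points, labels) if label == i]
            if cluster:
                centers[i] = [sum(col) / len(cluster) for col in zip(*cluster)]
    return centers, labels

def cluster_region(records, region, t1, t2):
    """Group by region, then cluster one region with canopy-seeded K-means."""
    groups = reduce_phase(map_phase(records))
    names = [name for name, _ in groups[region]]
    points = [features for _, features in groups[region]]
    seeds = [center for center, _ in canopies(points, t1, t2)]
    _, labels = kmeans(points, seeds)
    return dict(zip(names, labels))

if __name__ == "__main__":
    print(cluster_region(RECORDS, "Asia", t1=3.0, t2=1.5))
```

Seeding K-means with the canopy centers means the number of clusters does not have to be fixed in advance; it follows from the two distance thresholds, which would need to be tuned to the scale of the real indicators.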
II. Literature Survey
The World Development Indicators were studied with respect to banking crises in 2013 [2]. However, that analysis was conducted using a data mining server application for subgroup discovery induction. It identified five subsets among the countries with banking crises. Of these five subsets, three are characterized as financially driven, while the remaining two are associated with socioeconomic problems. There are also comparative studies of information development in China and in the world. In that work, the data for China were compared with data for whole continents rather than for individual countries, so the results do not show how China compares with the other Asian countries [3]. The study concludes that China has made steady growth in informatization.