International Journal of Basic & Applied Sciences IJBAS-IJENS Vol:13 No:01 (1215906-1301-4747-IJBAS-IJENS, February 2013, IJENS)

Multivariate Outlier Detection in Currency Exchange Rate

1 Shamshuritawati Sharif, 2 Maman A. Djauhari, 3 Nur Syahidah Yusoffi
1 School of Quantitative Sciences, College Arts and Sciences, Universiti Utara Malaysia, Kedah, Malaysia
2 Department of Mathematical Sciences, Faculty of Science, Universiti Teknologi Malaysia, Johor, Malaysia
3 Faculty of Industrial Sciences & Technology, Universiti Malaysia Pahang, Pahang, Malaysia
1 shamshurita@uum.edu.my, 2 maman@utm.my, 3 syazul88@yahoo.com, 1 (+60194248001)

Abstract: In correlation network analysis, the influential currencies are usually identified by using a minimal spanning tree (MST) to filter the important information, followed by an analysis of centrality measures. In this paper, we introduce an analysis, also based on the MST, that uses outlier labelling and testing to identify currencies whose behaviour might differ from that of the others. A case study on 78 currency exchange rates is reported and discussed.

Index Terms: correlation matrix; distance matrix; gamma distribution; Kruskal's algorithm

I. INTRODUCTION

A currency such as the Malaysian ringgit, US dollar, or British pound sterling can be used for payment transactions within a country. In any country whose residents conduct business abroad or engage in financial transactions with residents of other countries, one currency must be exchanged for another so that payments can be made in a form acceptable to foreigners. Exchange rates have a direct influence on all other markets because the price of any asset is expressed in terms of a currency [1]. Thus, currency exchange rates have recently become of vital importance to the global investment community. In investigating currency behaviour, a natural starting point is the examination of the correlations among currencies.
The correlation provides a similarity measure among the behaviour of different elements in the structure. Interestingly, currencies constitute a complex system [2]. Its complexity lies in the interrelationships among price fluctuations and in the number of currencies involved. Usually, those interrelationships are represented by a correlation analysis of the logarithmic returns of stock prices. The use of complex systems tools to analyze financial markets is important for a variety of reasons. To filter the important information contained in such a complex system, Mantegna introduced the use of a minimum spanning tree (MST) in 1999. Since then, the MST has been widely used in many areas of the financial industry, such as currency exchange rates [1], [3], [4], [5], stock markets [6], [7], and portfolio analysis [8]. It has been shown that a powerful method for investigating financial systems consists of extracting a minimal set of relevant interactions associated with the strongest correlations belonging to the MST [9].

Although the subject has been extensively studied, the methodology of previous studies has not placed special emphasis on possible outliers among the currencies. Identifying outliers is a very important issue in statistical data analysis. The occurrence of outliers can make estimates computed from the data matrix inadequate. Outliers can heavily influence the skewness, kurtosis and other estimates calculated for a dataset. Outliers are likely to occur in datasets with many observations and/or variables. To illustrate the procedure for outlier identification, in this paper we present an example dealing with 78 currency exchange rates, for which we perform a gamma quantile plot and a gap test based on the lengths of the edges in the MST.

The rest of the paper is organized as follows. In Section II we briefly explain outlier labelling and outlier testing.
Next, we define the methodology of this paper and the data preparation in Sections III and IV, respectively. Later on, a brief recall of correlation network analysis is presented in Section V. Then, in Section VI we discuss the results and analysis. At the end of the paper, we draw a conclusion.

II. OUTLIER LABELLING AND TESTING

Outliers in a dataset are usually assumed to be errors or noise of various kinds. Hawkins [10] defines an outlier as an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism. In one or two dimensions, outlying data are easily identified from a simple scatter plot. In that case, we can use Chauvenet's criterion, Peirce's criterion, Grubbs' test, or Dixon's Q test for testing the outlier. In higher dimensions, identification is more difficult, but many procedures and algorithms have been developed for outlier detection, depending on the application and the number of observations in the dataset. Examples include principal component analysis (PCA), the minimum covariance determinant (MCD), and Wilks' statistic; see [10], [11], [12] and [13]. Generally, the basic tool for multivariate outlier detection is the Mahalanobis distance. Hawkins [10] provides a comprehensive text on the labelling, accommodation, and identification of outliers. Overall, one useful method for outlier detection is outlier labelling, which deals with separating suspects from