American Journal of Intelligent Systems 2014, 4(4): 142-147
DOI: 10.5923/j.ajis.20140404.03
Finding Optimal Value for the Shrinkage Parameter in
Ridge Regression via Particle Swarm Optimization
Vedide Rezan Uslu¹, Erol Egrioglu²·*, Eren Bas³

¹ Department of Statistics, University of Ondokuz Mayis, Samsun, 55139, Turkey
² Department of Statistics, Marmara University, Istanbul, 34722, Turkey
³ Department of Statistics, Giresun University, Giresun, 28000, Turkey
Abstract  A multiple regression model rests on a set of standard assumptions. If the data do not satisfy these assumptions, problems arise that have serious undesired effects on the parameter estimates. One of these problems is multicollinearity, which means that there is a nearly perfect linear relationship between the explanatory variables used in a multiple regression model. This undesirable problem is generally handled by methods such as ridge regression, which gives biased parameter estimates. Ridge regression shrinks the ordinary least squares estimate of the vector of regression coefficients towards the origin, introducing a bias but providing a smaller variance. However, the choice of the shrinkage parameter k in ridge regression is another serious issue. In this study, a new algorithm based on particle swarm optimization is proposed to find the optimal shrinkage parameter.
Keywords Ridge regression, Optimal shrinkage parameter, Particle swarm optimization
1. Introduction
Linear regression is a classic statistical method. Like other statistical techniques, it relies on a number of assumptions, and these assumptions are often not realistic in real-world applications. Statisticians therefore check the assumptions and, when they do not hold for the data, apply more advanced statistical techniques. Ridge regression is one such technique: when the data suffer from a multicollinearity problem, ridge regression can provide a solution. In this study, a new ridge regression method is introduced.
Consider a linear multiple regression model

Y = Xβ + ε    (1)

where Y is the n×1 vector of observations of the dependent variable, X is the n×p′ matrix of observations of the explanatory variables with full rank p′, β is the p′×1 vector of unknown parameters and ε is the n×1 vector of random errors, where p′ = p + 1 and p is the number of explanatory variables in the model. It is assumed that each random error has zero mean and a constant variance σ², and that the errors are uncorrelated.
* Corresponding author: erole@omu.edu.tr (Erol Egrioglu)
Moreover, it is assumed that the columns of X are not linearly dependent on each other.
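As a concrete illustration of model (1), the following Python sketch (an assumed example, not part of the original paper; sample size, coefficients and noise level are illustrative) simulates data from the model and computes the ordinary least squares estimate (X'X)⁻¹X'Y.

    import numpy as np

    # Minimal sketch of model (1): Y = X*beta + epsilon, with n
    # observations, p explanatory variables and p' = p + 1 columns
    # in X (an intercept column plus the p explanatory variables).
    rng = np.random.default_rng(0)
    n, p = 50, 3
    beta_true = np.array([2.0, 0.5, -1.0, 3.0])      # p' = 4 parameters

    X = np.column_stack([np.ones(n),                 # intercept column
                         rng.normal(size=(n, p))])   # explanatory variables
    eps = rng.normal(scale=1.0, size=n)              # zero mean, constant variance
    Y = X @ beta_true + eps

    # Ordinary least squares estimate: beta_hat = (X'X)^{-1} X'Y
    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    print(beta_hat)                                  # close to beta_true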
Let us denote the columns of X as X_1, X_2, …, X_p. If there is a relationship

∑_{j=1}^{p} t_j X_j = 0    (2)

for a set of numbers t_1, t_2, …, t_p, not all zero, this relation is called the multicollinearity problem in multiple regression analysis.
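The relation in (2) can be made concrete with a small sketch. Assuming, purely for illustration, three columns with X_3 ≈ 2X_1 − X_2 (so (t_1, t_2, t_3) = (2, −1, −1) nearly satisfies (2)), the following Python code shows that X'X is then close to singular:

    import numpy as np

    # Assumed example of relation (2): X3 is almost a linear
    # combination of X1 and X2, so t1*X1 + t2*X2 + t3*X3 ≈ 0
    # for (t1, t2, t3) = (2, -1, -1).
    rng = np.random.default_rng(1)
    n = 50
    X1 = rng.normal(size=n)
    X2 = rng.normal(size=n)
    X3 = 2 * X1 - X2 + rng.normal(scale=1e-3, size=n)  # near-exact dependency
    X = np.column_stack([X1, X2, X3])

    eigvals = np.linalg.eigvalsh(X.T @ X)
    print("smallest eigenvalue:", eigvals[0])             # close to zero
    print("condition number:", eigvals[-1] / eigvals[0])  # very large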
The presence of multicollinearity has a number of potentially serious effects on the ordinary least squares estimates of the unknown parameters. The most serious one is that it inflates the variances and covariances of the least squares estimates of the regression coefficients. This implies that different samples taken at the same levels of the X's could lead to completely different estimates of the model parameters.
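This sampling instability is easy to demonstrate. The simulation below (a hypothetical sketch; the sample size, degree of collinearity and noise level are assumptions) keeps X fixed and nearly collinear, redraws only the errors, and shows that the resulting OLS estimates vary wildly from sample to sample:

    import numpy as np

    # Assumed simulation: fixed, nearly collinear X; only the
    # random errors change between samples, yet the OLS estimates
    # fluctuate strongly, reflecting their inflated variances.
    rng = np.random.default_rng(2)
    n = 50
    X1 = rng.normal(size=n)
    X2 = X1 + rng.normal(scale=1e-2, size=n)         # X2 nearly equals X1
    X = np.column_stack([X1, X2])
    beta_true = np.array([1.0, 1.0])

    estimates = []
    for _ in range(1000):
        Y = X @ beta_true + rng.normal(size=n)       # same X, new errors
        estimates.append(np.linalg.lstsq(X, Y, rcond=None)[0])
    estimates = np.array(estimates)

    print("std of estimates:", estimates.std(axis=0))      # very large
    print("largest |estimate|:", np.abs(estimates).max())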
Multicollinearity can also produce least squares estimates of the β's that are too large in absolute value.
When the columns of the X matrix are centered and scaled, the matrix X'X becomes the correlation matrix of the explanatory variables and X'Y is the vector of the correlation coefficients of the dependent variable with each explanatory variable. If the columns of X are orthogonal, X'X is the identity matrix. In the presence of multicollinearity, X'X becomes ill-conditioned, which means that it is nearly singular and its determinant is