REVISTA INVESTIGACION OPERACIONAL VOL. 40, NO. 3, 391-399, 2019

DYNAMIC CUR, AN ALTERNATIVE TO VARIABLE SELECTION IN CUR DECOMPOSITION

Greibin Villegas Barahona *1, Carlos Manuel Martín Barreiro **, Nerea González García ***, Sergio Hernández González ****, Mercedes Sánchez Barba ***, María Purificación Galindo Villardón ***

* Universidad Estatal a Distancia, Costa Rica
** ESPOL Polytechnic University, ESPOL, FCNM, Guayaquil, Ecuador
*** Universidad de Salamanca, España
**** Universidad Veracruzana, México

1 gvillegas@uned.ac.cr

ABSTRACT
CUR decomposition is one of the matrix decomposition techniques proposed in the literature for selecting rows and/or columns of a data matrix. Dynamic CUR is proposed as an alternative to the selection criterion of the CUR decomposition, based on probabilistic criteria. It fits the most suitable theoretical probability distribution to the empirical distribution of the initial leverage scores and, based on that fit, automatically determines not only which individuals and/or variables to select but also how many. Dynamic CUR thus differs from CUR in its information selection criterion: the calculation of the approximation error becomes dynamic, starting from an optimal initial choice of parameters based on the best-fitting probability distribution. Lastly, to facilitate the use of this new method in any practical context, the Dynamic CUR algorithm has been implemented in C#.NET and R.

KEYWORDS: Multivariate analysis, Principal component analysis, CUR decomposition, Correlation, Singular Value Decomposition.

MSC: 62E17, 62G30, 49M27

1. INTRODUCTION

Data analysis has evolved considerably in recent years, moving from basic descriptive analysis of a few variables to the use of multivariate statistical techniques for working with “Big Data”. Analysis tools must take advantage of all the information the data provide in order to reproduce reality accurately, because the power of individual studies and the understanding of phenomena lie in a multivariate vision of the world.

Principal Component Analysis (PCA) (Jolliffe, 2002) arose from the classical methods for reducing the dimensionality of a data matrix, initiated by Pearson (1901), who sought the best-fitting subspace, and Hotelling (1933), who sought to maximize variance. PCA is the most widely used multivariate dimension-reduction technique to date. Starting from a set of p correlated variables, it extracts q uncorrelated latent variables (known as principal components), with q << p, that capture as much of the variability as possible and can be used to describe the behavior of the sample.
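
The reduction from p correlated variables to q uncorrelated components can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes NumPy is available, computes PCA through the singular value decomposition of the column-centered data matrix, and uses a hypothetical helper name (pca_scores) and simulated data chosen only for illustration.

```python
import numpy as np

def pca_scores(X, q):
    """Return the first q principal component scores of X and the
    fraction of total variance that each of them explains."""
    # Center each variable (column) so the SVD captures variance, not location.
    Xc = X - X.mean(axis=0)
    # Thin SVD: Xc = U @ diag(s) @ Vt, with singular values in decreasing order.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Scores are the projections of the observations onto the first q
    # principal directions (equivalently Xc @ Vt[:q].T).
    scores = U[:, :q] * s[:q]
    # Squared singular values are proportional to the component variances.
    explained = s**2 / np.sum(s**2)
    return scores, explained[:q]

# Simulated data (an assumption for illustration only): p = 10 observed
# variables driven by 2 latent factors plus a small amount of noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 10))

scores, explained = pca_scores(X, q=2)
print("explained variance ratio of the first 2 components:", explained)
```

In this simulated setting q = 2 components should retain nearly all of the variability of the p = 10 variables, and the resulting score columns are uncorrelated by construction.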