ASM Sc. J., 13, 2020
https://doi.org/10.32802/asmscj.2020.sm26(1.11)

Measuring the Relationship of Bivariate Data Using Hodges-Lehmann Estimator

Suhaida Abdullah*, Nur Amira Zakaria, Nor Aishah Ahad, Norhayati Yusof and Sharipah Soaad Syed Yahaya

College of Arts and Sciences, Universiti Utara Malaysia, 06010 UUM Sintok, Malaysia

*Corresponding author's e-mail: suhaida@uum.edu.my

The relationship between bivariate data is ordinarily measured using a correlation coefficient. The most commonly used is the Pearson correlation coefficient, which is well known as the best coefficient for interval or ratio bivariate data with a linear relationship. Although this coefficient performs well under that condition, it is very sensitive to even a small departure from linearity, which is usually caused by the presence of an outlier. For that reason, this paper proposes new robust correlation coefficients that combine the nonparametric technique of the Hodges-Lehmann estimator with the parametric technique of the Pearson correlation coefficient. The proposed coefficients use two different scale estimators, the median and the median absolute deviation (MADn), and are denoted by rHL(med) and rHL(MADn), respectively. Their performance is measured by the coefficient values, which are compared with those of the Pearson correlation coefficient and several existing robust correlation coefficients. The results show that the Pearson correlation coefficient (r) performs very well under perfect data conditions, but with only 10% outliers it not only gives a poor correlation value but also reverses the direction of the relationship to negative. In contrast, rHL(med) and rHL(MADn) give the highest coefficient values, and these values remain robust with up to 30% outliers.
Because they perform very well under all data conditions yet are simple to calculate, rHL(med) and rHL(MADn) are considered good alternatives to r when outliers must be dealt with.

Keywords: correlation coefficient; Hodges-Lehmann; median; median absolute deviation (MADn)

I. INTRODUCTION

The correlation coefficient is a well-known measure of the relationship between two variables. The Pearson correlation coefficient is one of the most commonly used correlation coefficients, especially when the variables have a linear relationship, but it performs poorly when the relationship deviates from linearity. This shortcoming is usually handled by using nonparametric correlation coefficients such as the Spearman or Kendall's tau correlation coefficient. These coefficients are not influenced by the presence of outliers because they use ranks in their calculation. However, ranking is not the best way to avoid the effect of outliers because it does not use the original data. As stated by Xu et al. (2016), using ranks instead of the original data might lead to the loss of useful information.

The Pearson correlation coefficient is unable to handle outliers because it uses the mean as its location estimator. The mean is known to be very sensitive to outliers, with a breakdown point of 0%. This drawback has encouraged the development of robust correlation coefficients as alternatives to the Pearson correlation coefficient for handling outliers. A robust correlation coefficient can be a better option than a nonparametric one because it lessens the influence of outliers while still using the original data. To date, the median-based robust correlation coefficient developed by Shevlyakov et al. (2012) has provided a more reliable measurement of the coefficient. The median is known to have the maximum breakdown point, which is 50%. However,