International Journal of Cardiology 69 (1999) 185–189

How high can a correlation coefficient be? Effects of limited reproducibility of common cardiological measures

Darrel P. Francis*, Andrew J.S. Coats, Derek G. Gibson

Department of Cardiology, Royal Brompton Hospital, Sydney Street, London SW3 6NP, UK

Received 9 December 1998; received in revised form 1 February 1999; accepted 1 February 1999

*Corresponding author. Tel.: +44-973-105-394; fax: +44-171-351-8634/8733. E-mail address: d.francis@rbh.nthames.nhs.uk (D.P. Francis)

Abstract

In clinical studies the linear correlation coefficient is commonly used to quantify the strength of the association between two variables, such as height and weight: the value of $r$ indicates whether the relationship is a strong one. However, actual clinical data includes an underlying physical variable plus an inevitable measurement error component that represents the reproducibility of the test used. If test reproducibility is poor, then even if the underlying physical variables are perfectly correlated, the actual observed correlation coefficient cannot be one but must be somewhat less. We present a method for calculating the reduction in correlation coefficient due to limited reproducibility, and discuss its implications with respect to experimental design and interpretation.

© 1999 Elsevier Science Ireland Ltd. All rights reserved.

0167-5273/99/$ – see front matter. PII: S0167-5273(99)00028-5

Keywords: Correlation coefficient; Reproducibility

1. Introduction

In clinical studies Pearson's product-moment correlation coefficient ($r$) is commonly used to quantify the strength of the linear association between two physically independent measured variables ($x$ and $y$). High values are frequently quoted, and indeed seem to be sought by journal editors and readers alike. However, real clinical data ($x'$ and $y'$) always include underlying physical variables ($x$ and $y$) plus inevitable measurement error components ($e_x$ and $e_y$ respectively), which represent the reproducibility of the tests used.

If these measurement errors are large (test reproducibility is poor), then even if the underlying physical variables $x$ and $y$ are perfectly correlated ($r = 1$, Fig. 1a), the measured correlation coefficient ($r'$, shown in Fig. 1b) cannot be one but must be somewhat less. We describe a method which allows this effect to be quantified, once the reproducibility of the individual measurements is known.

2. Definitions

$x$: underlying (or 'true') physical variable;
$\bar{x}$: mean of underlying physical variable;
$V_x$: variance of underlying physical variable;
$e_x$: error component due to reproducibility of test;
$V_{e_x}$: variance of error component due to reproducibility of test;
$x'$: measured (or 'observed') value of $x$, equal to $x + e_x$;
$V_{x'}$: variance of measured variable, equal to $V_x + V_{e_x}$, if the error is random with respect to the measured value.

We define similarly for $y$: $\bar{y}$, $V_y$, $e_y$, $V_{e_y}$, $y'$, and $V_{y'}$.
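The definitions above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' method: it assumes normally distributed true values and errors, and every name and number in it (`simulate`, `sd_true`, `sd_error`, the linear relation `y = 2x + 3`, the sample size and seed) is a hypothetical choice made only for illustration. It generates perfectly correlated underlying variables, adds independent random error to each, and reports the variances and the correlation coefficients before and after the error is added.

```python
import math
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def pearson_r(a, b):
    """Pearson product-moment correlation of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    ss_a = sum((p - mean_a) ** 2 for p in a)
    ss_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(ss_a * ss_b)


def simulate(n=20000, sd_true=10.0, sd_error=5.0, seed=1):
    """Simulate true and measured variables (all parameters are assumptions)."""
    rng = random.Random(seed)
    # Underlying ('true') variables: y is an exact linear function of x,
    # so the underlying correlation is one.
    x = [rng.gauss(50.0, sd_true) for _ in range(n)]
    y = [2.0 * v + 3.0 for v in x]
    # Error components, drawn independently of the true values.
    e_x = [rng.gauss(0.0, sd_error) for _ in range(n)]
    e_y = [rng.gauss(0.0, 2.0 * sd_error) for _ in range(n)]
    # Measured ('observed') variables: x' = x + e_x, y' = y + e_y.
    x_obs = [v + e for v, e in zip(x, e_x)]
    y_obs = [v + e for v, e in zip(y, e_y)]
    return {
        "V_x": variance(x),
        "V_ex": variance(e_x),
        "V_x_obs": variance(x_obs),
        "r_true": pearson_r(x, y),
        "r_obs": pearson_r(x_obs, y_obs),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions the variance of the measured variable comes out close to the sum of the true and error variances, and the correlation between the measured variables falls noticeably below the underlying value of one, even though nothing about the underlying relationship has changed.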