Forum on Systems and Complexity in Medicine and Healthcare

Complexity and categorical analysis may improve the interpretation of agreement studies using continuous variables

Cristina Costa-Santos PhD,1 João Bernardes MD PhD,2 Luís Antunes PhD3 and Diogo Ayres-de-Campos MD PhD2

1 Professor, Department of Biostatistics and Medical Informatics, Faculty of Medicine, University of Porto, Porto, Portugal and Researcher, Center for Research in Health Technologies and Information Systems, Porto, Portugal
2 Professor, Department of Obstetrics and Gynaecology, Faculty of Medicine and S. João Hospital, University of Porto, Porto, Portugal and Researcher, Institute of Biomedical Engineering, Porto, Portugal
3 Professor, Computer Science Department, Faculty of Science, University of Porto, Porto, Portugal and Researcher, Instituto de Telecomunicações (partially supported by CSI2 PTDC/EIA-CCO/099951/2008), Porto, Portugal

Keywords: cardiotocography, complexity, observer variation, reproducibility of results, statistical data interpretation

Correspondence: Dr Cristina Costa-Santos, Biostatistics and Medical Informatics Department, Faculty of Medicine, University of Porto, Al. Prof. Hernâni Monteiro, 4200-319 Porto, Portugal. E-mail: csantos@med.up.pt

Accepted for publication: 23 March 2011

doi:10.1111/j.1365-2753.2011.01668.x

Abstract

Rationale: Complex clinical scenarios involving a high degree of uncertainty frequently lead to poor agreement over diagnosis and management. However, inconsistent results can be found with the most widely used measures of agreement for continuous variables: the limits of agreement and the intraclass correlation coefficient.

Aims and objectives: We aim to improve the interpretation of agreement studies using continuous variables.

Methods and results: Evaluation of agreement may be improved by complexity analysis and by categorization of variables, followed by the use of the proportions of agreement.
Conclusions: The average never fully characterizes a complex phenomenon, yet the methods used to assess agreement in continuous variables are based on the mean. For future agreement studies involving complex continuous variables, we recommend complexity and categorical analysis.

Introduction

Reproducibility of measurements is an essential factor for clinical practice and for epidemiological research, but observer disagreement and other sources of variability are often found in clinical practice. Disagreement over clinical decisions may have important research, clinical and medico-legal consequences [1]. However, the ideal statistical measure to evaluate agreement has yet to be established, and the use of more than one measure has been proposed in the past [1–3]. Even when this solution is adopted, inconsistent results can be found in the assessment of complex continuous variables with the most widely used measures: the limits of agreement (LA) and the intraclass correlation coefficient (ICC) [4]. In a previous study evaluating agreement in the prediction of umbilical artery blood pH (UAB pH) and Apgar scores, based on the interpretation of foetal heart rate (FHR) tracings, LA results suggested fair to good agreement, whereas the ICC suggested poor to fair agreement [4]. The LA results were judged to be more consistent with reality, but their interpretation was less consensual, whereas the opposite occurred with the ICC. Other approaches have been developed for the assessment of agreement in continuous variables, as a complement to the ICC and LA [5,6].

In this brief report, we propose the addition of complexity analysis, and the transformation of continuous variables into categorical variables, as a way to improve the interpretation of results. For this purpose, a reappraisal of the previously cited study [4] was performed, using a larger sample size.
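To make the contrast between these measures concrete, the following minimal Python sketch computes all three quantities discussed here: the Bland–Altman limits of agreement, a one-way random-effects ICC, and the proportion of agreement after dichotomization. The pH values and the acidaemia cut-off of 7.20 are illustrative assumptions for this sketch, not data or thresholds taken from the study.

```python
from statistics import mean, stdev

# Hypothetical paired predictions of umbilical artery blood pH by two
# observers (illustrative values only, not data from the cited study).
obs1 = [7.25, 7.18, 7.30, 7.10, 7.22, 7.28, 7.15, 7.26, 7.19, 7.31]
obs2 = [7.23, 7.21, 7.28, 7.14, 7.20, 7.25, 7.19, 7.27, 7.16, 7.29]

# 1. Limits of agreement (Bland-Altman): mean difference +/- 1.96 SD.
diffs = [a - b for a, b in zip(obs1, obs2)]
bias = mean(diffs)
sd = stdev(diffs)
la_low, la_high = bias - 1.96 * sd, bias + 1.96 * sd

# 2. One-way random-effects ICC, ICC(1,1), from the ANOVA mean squares:
#    (MSB - MSW) / (MSB + (k - 1) * MSW).
k = 2                              # raters per subject
n = len(obs1)                      # subjects
grand = mean(obs1 + obs2)
subj_means = [(a + b) / k for a, b in zip(obs1, obs2)]
msb = k * sum((m - grand) ** 2 for m in subj_means) / (n - 1)
msw = sum((x - m) ** 2
          for a, b, m in zip(obs1, obs2, subj_means)
          for x in (a, b)) / (n * (k - 1))
icc = (msb - msw) / (msb + (k - 1) * msw)

# 3. Proportion of agreement after categorization: dichotomize at an
#    assumed acidaemia threshold (pH < 7.20) and count matching labels.
def acidaemic(x):
    return x < 7.20

pa = mean(acidaemic(a) == acidaemic(b) for a, b in zip(obs1, obs2))

print(f"limits of agreement: [{la_low:.3f}, {la_high:.3f}] (bias {bias:+.3f})")
print(f"ICC(1,1): {icc:.2f}")
print(f"proportion of agreement: {pa:.2f}")
```

Note how the first two summaries are built entirely from means and variances of the paired values, whereas the proportion of agreement asks a categorical question directly; this is the sense in which categorization can yield a more interpretable answer for complex variables.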
Complexity in clinical decisions

A complex system is a collection of individual agents acting in ways that are not totally predictable and whose actions are interconnected, so that the action of one part changes the context for other agents [7]. The certainty–agreement diagram, proposed by Plsek

Journal of Evaluation in Clinical Practice ISSN 1365-2753
© 2011 Blackwell Publishing Ltd, Journal of Evaluation in Clinical Practice 17 (2011) 511–514