How reliable and accurate is the AO/OTA comprehensive classification for adult long-bone fractures?

Terje Meling, MD, Knut Harboe, MD, Cathrine H. Enoksen, MD, Morten Aarflot, MSc, Astvaldur J. Arthursson, MD, PhD, and Kjetil Søreide, MD, PhD, Stavanger, Norway

BACKGROUND: Reliable classification of fractures is important for treatment allocation and study comparisons. The overall accuracy of scoring applied to a general population of fractures is little known. This study aimed to investigate the accuracy and reliability of the comprehensive Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association classification for adult long-bone fractures and identify factors associated with poor coding agreement.
METHODS: Adults (>16 years) with long-bone fractures coded in a Fracture and Dislocation Registry at the Stavanger University Hospital during the fiscal year 2008 were included. An unblinded reference code dataset was generated for the overall accuracy assessment by two experienced orthopedic trauma surgeons. Blinded analysis of intrarater reliability was performed by rescoring and of interrater reliability by recoding of a randomly selected fracture sample. Proportion of agreement (PA) and kappa (κ) statistics are presented. Uni- and multivariate logistic regression analyses of factors predicting accuracy were performed.
RESULTS: During the study period, 949 fractures were included and coded by 26 surgeons. For the intrarater analysis, overall agreements were κ = 0.67 (95% confidence interval [CI]: 0.64–0.70) and PA 69%. For interrater assessment, κ = 0.67 (95% CI: 0.62–0.72) and PA 69%. The accuracy of surgeons’ blinded recoding was κ = 0.68 (95% CI: 0.65–0.71) and PA 68%. Fracture type, frequency of the fracture, and segment fractured significantly influenced accuracy whereas the coder’s experience did not.
CONCLUSIONS: Both the reliability and accuracy of the comprehensive Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association classification for long-bone fractures ranged from substantial to excellent. Variations in coding accuracy seem to be related more to the fracture itself than the surgeon. (J Trauma Acute Care Surg. 2012;73: 224–231. Copyright © 2012 by Lippincott Williams & Wilkins)
LEVEL OF EVIDENCE: Diagnostic study, level I.
KEY WORDS: Fracture; long bone; agreement; validity; registry; classification.

The global trauma burden is increasing and patients with fractures of the long bones make up most of the emergency capacity of orthopedic departments worldwide. 1,2 To compare treatment methods and outcomes, a valid fracture classification system is required. Use of hospital administrative classifications such as the International Classification of Diseases (ICD-10) and Nordic Medico-Statistical Committee Classification of Surgical Procedures (NCSP) fails to provide sufficient stratification of valuable results regarding treatment options. 3 The Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association (AO/OTA) classification of long-bone fractures was developed for this purpose and has since received worldwide acceptance. However, the AO/OTA system has been criticized for not being subjected to a thorough validation process. 4 During the past two decades, a considerable number of studies have been published regarding its validity. 5,6 Although most of these studies considered the intraobserver and interobserver reliability, the accuracy measurements are frequently missing. 5 Most studies only deal with selected segments of the comprehensive classification. 7,8 Thus, an overall evaluation of the comprehensive classification is scarce and consequently has been requested by the AO Classification Supervisory Committee.
6,9,10 Of note, even after creating ideal conditions regarding fracture definitions, imaging quality, and the knowledge of the raters, observer disagreement frequently occurs. 6 How these variables contribute to the reliability and accuracy of the complete classification applied to a fracture registry is not well known. The objective of this study was to assess the overall reliability and accuracy of the complete AO/OTA classification of long-bone fractures and explore factors contributing to disagreement.

ORIGINAL ARTICLE. J Trauma Acute Care Surg, Volume 73, Number 1.
Submitted: June 6, 2011, Revised: December 2, 2011, Accepted: January 25, 2012, Published online: May 2, 2012.
From the Department of Orthopedic Surgery (T.M., K.H., C.H.E., A.J.A.), Stavanger University Hospital, Stavanger, Norway; Norwegian Centre for Movement Disorders (M.A.), Stavanger University Hospital, Stavanger, Norway; Department of Surgery (K.S.), Stavanger University Hospital, Stavanger, Norway; and Department of Surgical Sciences (K.S.), University of Bergen, Bergen, Norway.
Supported by a grant from the Stavanger Health Trust Research Council.
Address for reprints: Kjetil Søreide, MD, PhD, Department of Surgery, Stavanger University Hospital, POB 8100, Armauer Hansens vei 20, N-4068 Stavanger, Norway; e-mail: ksoreide@mac.com.
DOI: 10.1097/TA.0b013e31824cf0ab

MATERIALS AND METHODS
The Stavanger University Hospital covers all aspects of traumatic and nontraumatic orthopedic surgery of the long-bone skeleton 11 and serves as the only primary trauma and emergency care facility for a mixed urban and rural population of about 317,000 inhabitants in the southwestern part of Norway. 12 Thus, the study results should be applicable to other western populations. All inhospital-treated long-bone fractures