How reliable and accurate is the AO/OTA comprehensive
classification for adult long-bone fractures?
Terje Meling, MD, Knut Harboe, MD, Cathrine H. Enoksen, MD, Morten Aarflot, MSc,
Astvaldur J. Arthursson, MD, PhD, and Kjetil Søreide, MD, PhD, Stavanger, Norway
BACKGROUND: Reliable classification of fractures is important for treatment allocation and study comparisons. Little is known about the overall accuracy of coding when applied to a general population of fractures. This study aimed to investigate the accuracy and reliability of the comprehensive Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association classification for adult long-bone fractures and to identify factors associated with poor coding agreement.
METHODS: Adults (>16 years) with long-bone fractures coded in a Fracture and Dislocation Registry at the Stavanger University Hospital during the fiscal year 2008 were included. An unblinded reference code dataset was generated for the overall accuracy assessment by two experienced orthopedic trauma surgeons. Blinded analysis of intrarater reliability was performed by rescoring and of interrater reliability by recoding of a randomly selected fracture sample. Proportion of agreement (PA) and kappa (κ) statistics are presented. Uni- and multivariate logistic regression analyses of factors predicting accuracy were performed.
RESULTS: During the study period, 949 fractures were included and coded by 26 surgeons. For the intrarater analysis, overall agreements were κ = 0.67 (95% confidence interval [CI]: 0.64–0.70) and PA 69%. For interrater assessment, κ = 0.67 (95% CI: 0.62–0.72) and PA 69%. The accuracy of surgeons’ blinded recoding was κ = 0.68 (95% CI: 0.65–0.71) and PA 68%. Fracture type, frequency of the fracture, and segment fractured significantly influenced accuracy whereas the coder’s experience did not.
CONCLUSIONS: Both the reliability and accuracy of the comprehensive Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association classification for long-bone fractures ranged from substantial to excellent. Variations in coding accuracy seem to be related more to the fracture itself than the surgeon. (J Trauma Acute Care Surg. 2012;73: 224–231. Copyright © 2012 by Lippincott Williams & Wilkins)
LEVEL OF EVIDENCE: Diagnostic study, level I.
KEY WORDS: Fracture; long bone; agreement; validity; registry; classification.
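For readers less familiar with the two agreement measures reported in the abstract, the following is a minimal sketch of how a proportion of agreement (PA) and a kappa statistic can be computed from two sets of codes for the same fractures. The abstract does not state which kappa variant was used, so the sketch assumes Cohen's unweighted kappa for two ratings per fracture. The function names and the eight example codes are hypothetical illustrations and are not data from the study.

```python
from collections import Counter


def proportion_agreement(codes_a, codes_b):
    """Fraction of fractures given the identical code in both ratings."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohen_kappa(codes_a, codes_b):
    """Cohen's unweighted kappa: agreement corrected for chance.

    Assumption: two ratings per fracture (e.g. first coding and recoding).
    """
    n = len(codes_a)
    p_observed = proportion_agreement(codes_a, codes_b)
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: product of the marginal proportions, summed over codes.
    categories = counts_a.keys() | counts_b.keys()
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    if p_expected == 1.0:
        # Both ratings use a single identical code; agreement is complete.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy ratings (illustrative codes only, not study data).
first = ["23-A2", "23-A2", "44-B1", "31-A1", "44-B1", "23-C1", "31-A1", "23-A2"]
second = ["23-A2", "23-A3", "44-B1", "31-A1", "44-B2", "23-C1", "31-A1", "23-A2"]

pa = proportion_agreement(first, second)
kappa = cohen_kappa(first, second)
print(f"PA = {pa:.2f}, kappa = {kappa:.2f}")
```

Because kappa subtracts the agreement expected by chance from the observed agreement, it is always at most equal to PA, and the gap between the two narrows as the number of possible codes grows and chance agreement becomes less likely.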
The global trauma burden is increasing and patients with fractures of the long bones make up most of the emergency capacity of orthopedic departments worldwide.1,2 To compare treatment methods and outcomes, a valid fracture classification system is required. Use of hospital administrative classifications such as the International Classification of Diseases (ICD-10) and Nordic Medico-Statistical Committee Classification of Surgical Procedures (NCSP) fails to provide sufficient stratification of valuable results regarding treatment options.3 The Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association (AO/OTA) classification of long-bone fractures was developed for this purpose and has since received worldwide acceptance. However, the AO/OTA system has been criticized for not being subjected to a thorough validation process.4 During the past two decades, a considerable number of studies have been published regarding its validity.5,6 Although most of these studies considered the intraobserver and interobserver reliability, accuracy measurements are frequently missing.5 Most studies deal only with selected segments of the comprehensive classification.7,8 Thus, overall evaluations of the comprehensive classification are scarce and have consequently been requested by the AO Classification Supervisory Committee.6,9,10 Of note, even after creating ideal conditions regarding fracture definitions, imaging quality, and the knowledge of the raters, observer disagreement frequently occurs.6 How these variables contribute to the reliability and accuracy of the complete classification applied to a fracture registry is not well known.
The objective of this study was to assess the overall reliability and accuracy of the complete AO/OTA classification of long-bone fractures and explore factors contributing to disagreement.
MATERIALS AND METHODS
The Stavanger University Hospital covers all aspects of traumatic and nontraumatic orthopedic surgery of the long-bone skeleton11 and serves as the only primary trauma and emergency care facility for a mixed urban and rural population of about 317,000 inhabitants in the southwestern part of Norway.12 Thus, the study results should be applicable to other western populations. All inhospital-treated long-bone fractures
ORIGINAL ARTICLE
J Trauma Acute Care Surg
Volume 73, Number 1 224
Submitted: June 6, 2011, Revised: December 2, 2011, Accepted: January 25, 2012,
Published online: May 2, 2012.
From the Department of Orthopedic Surgery (T.M., K.H., C.H.E., A.J.A.), Stavanger
University Hospital, Stavanger, Norway; Norwegian Centre for Movement
Disorders (M.A.), Stavanger University Hospital, Stavanger, Norway; Depart-
ment of Surgery (K.S.), Stavanger University Hospital, Stavanger, Norway; and
Department of Surgical Sciences (K.S.), University of Bergen, Bergen, Norway.
Supported by a grant from the Stavanger Health Trust Research Council.
Address for reprints: Kjetil Søreide, MD, PhD, Department of Surgery, Stavanger
University Hospital, POB 8100, Armauer Hansens vei 20, N-4068 Stavanger,
Norway; e-mail: ksoreide@mac.com.
DOI: 10.1097/TA.0b013e31824cf0ab