Score Comparability of Short Forms and Computerized
Adaptive Testing: Simulation Study With the Activity Measure
for Post-Acute Care
Stephen M. Haley, PhD, PT, Wendy J. Coster, PhD, OTR, Patricia L. Andres, MS, PT, Mark Kosinski, MA,
Pengsheng Ni, MD, MPH
ABSTRACT. Haley SM, Coster WJ, Andres PL, Kosinski
M, Ni P. Score comparability of short forms and computerized
adaptive testing: simulation study with the Activity Measure
for Post-Acute Care. Arch Phys Med Rehabil 2004;85:661-6.
Objective: To compare simulated short-form and computerized
adaptive testing (CAT) scores to scores obtained from
complete item sets for each of the 3 domains of the Activity
Measure for Post-Acute Care (AM-PAC).
Design: Prospective study.
Setting: Six postacute health care networks in the greater
Boston metropolitan area, including inpatient acute rehabilitation,
transitional care units, home care, and outpatient services.
Participants: A convenience sample of 485 adult volunteers
who were receiving skilled rehabilitation services.
Interventions: Not applicable.
Main Outcome Measures: Inpatient and community-based
short forms and CAT applications were developed for each of
3 activity domains (physical & movement, personal care &
instrumental, applied cognition) using item pools constructed
from new items and items from existing postacute care instruments.
Results: Simulated CAT scores correlated highly with score
estimates from the total item pool in each domain (4- and
6-item CAT r range, .90–.95; 10-item CAT r range, .96–.98).
Scores on the 10-item short forms constructed for inpatient and
community settings also provided good estimates of the AM-PAC
item pool scores for the physical & movement and personal
care & instrumental domains, but were less consistent in
the applied cognition domain. Confidence intervals around
individual scores were wider for the short forms than for the
CATs.
Conclusions: Accurate scoring estimates for AM-PAC domains
can be obtained with either the setting-specific short
forms or the CATs. The strong relationship between CAT and
item pool scores can be attributed to the CAT’s ability to select
specific items to match individual responses. The CAT may
have additional advantages over short forms in practicality,
efficiency, and the potential for providing more precise scoring
estimates for individuals.
Key Words: Activities of daily living; Outcomes research;
Rehabilitation.
© 2004 by the American Congress of Rehabilitation Medicine
and the American Academy of Physical Medicine and
Rehabilitation
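The abstract attributes the CAT's accuracy to its ability to select items that match individual responses. The following is a minimal sketch of that idea, not the authors' procedure. It assumes a dichotomous Rasch model (the AM-PAC items themselves may be scored differently), a hypothetical 25-item pool with invented difficulties, responses generated from the model rather than taken from real patients, and a simple grid-based expected a posteriori (EAP) estimate. All function and variable names are illustrative.

```python
import math
import random

def rasch_prob(theta, b):
    """Probability of a positive response under a dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def eap_estimate(responses, grid):
    """EAP ability estimate from (difficulty, score) pairs with a N(0, 1) prior."""
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalized standard normal prior
        for b, x in responses:
            p = rasch_prob(theta, b)
            w *= p if x == 1 else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total

def run_cat(true_theta, pool, n_items, rng):
    """Administer n_items adaptively and return (estimate, difficulties used)."""
    grid = [i / 10.0 for i in range(-40, 41)]  # ability grid from -4 to 4
    remaining = list(pool)
    responses = []
    theta_hat = 0.0  # start at the prior mean
    for _ in range(n_items):
        # Under the Rasch model, the most informative unused item is the one
        # whose difficulty is closest to the current ability estimate.
        b = min(remaining, key=lambda d: abs(d - theta_hat))
        remaining.remove(b)
        # Simulated response (an assumption of this sketch, not study data).
        x = 1 if rng.random() < rasch_prob(true_theta, b) else 0
        responses.append((b, x))
        theta_hat = eap_estimate(responses, grid)
    return theta_hat, [b for b, _ in responses]

# Hypothetical pool: 25 item difficulties from -3 to 3 logits.
pool = [i / 4.0 for i in range(-12, 13)]
rng = random.Random(0)
estimate, administered = run_cat(1.0, pool, 10, rng)
print(f"Estimated ability after 10 items: {estimate:.2f}")
print("Item difficulties administered:", administered)
```

A fixed short form would administer the same items to everyone, whereas the loop above changes the next item after every response, which is the property the abstract credits for the close agreement between CAT and item pool scores.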
AS PATIENTS RECOVER from illness or injury, an assessment
system to measure a continuously changing repertoire
of functional skills is needed throughout the continuum
of postacute care services. Despite mounting interest, no system
has emerged that can effectively measure functional outcomes
across settings.¹,² We have highlighted 3 problems that
currently plague outcome measurement in postacute care settings:
limited breadth, poor precision, and lack of feasibility.³
Measurement precision is optimal when the content of functional
items and the patients’ abilities are closely matched.
However, in heterogeneous groups, such as are seen in postacute
care services, an optimal set of items that fits most
patients in a particular subgroup may not be relevant for all
patients in the larger group. Therefore, any one instrument
developed for a specific setting typically has considerable floor
and ceiling effects when used in other postacute care settings.⁴
To make instruments more practical, the range of content is
often compromised, leading to large amounts of measurement
noise at various levels of the scale. However, at the level of
individual patient assessment, precision is required if either
treatment or placement decisions are based on functional
scores. To achieve comprehensiveness and precision within a
fixed-item format, some monitoring systems (eg, the recently
proposed Minimum Data Set–Post Acute Care)⁵ are cumbersome
and impractical. Collectively, the lack of breadth, unequal
precision for all patients, and the limited feasibility of
current systems severely restrict the field’s ability to measure
and analyze rehabilitation progress across the continuum of
postacute care settings.⁶⁻⁹
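The claim that precision is best when item content and patient ability are closely matched can be illustrated with a small sketch. It assumes a dichotomous Rasch model, which is a simplifying assumption made here and not a model stated in the text; the item difficulties are invented for illustration. Under that model, the information an item contributes at ability theta is P(1 - P), which peaks when the item difficulty equals theta, and the standard error of the ability estimate is approximately the reciprocal square root of the summed information.

```python
import math

def rasch_prob(theta, b):
    """Probability of a positive response under a dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of one Rasch item at ability theta: P * (1 - P)."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)

def standard_error(theta, difficulties):
    """Approximate standard error: 1 / sqrt(total test information)."""
    total = sum(item_information(theta, b) for b in difficulties)
    return 1.0 / math.sqrt(total)

# Hypothetical item difficulties in logits (invented for illustration).
matched_items = [-0.5, -0.25, 0.0, 0.25, 0.5]     # near the patient's ability
mismatched_items = [2.5, 2.75, 3.0, 3.25, 3.5]    # far too hard for this patient

theta = 0.0  # ability of a hypothetical patient
se_matched = standard_error(theta, matched_items)
se_mismatched = standard_error(theta, mismatched_items)
print(f"SE with matched items:    {se_matched:.2f}")
print(f"SE with mismatched items: {se_mismatched:.2f}")
```

With the same number of items, the mismatched set yields a larger standard error, which is the floor and ceiling problem described above expressed in terms of test information.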
Recently, there has been intense interest in the application of
item response theory (IRT) to develop the next generation of
practical and precise instruments for monitoring functional
recovery,¹⁰⁻¹² by overcoming the unremitting breadth, precision,
and practicality challenges. To realize many of the potential
measurement advantages of IRT, item pools¹³ are developed
that contain high-quality items to tap many levels of
functional abilities. Item pools are often built by equating
functional items from different sources so that they can be
linked to form a comprehensive sample of abilities on a common,
underlying metric. The development of conceptually
valid item pools along meaningful functional dimensions appears
to hold promise for the creation of fixed-length short
forms and computerized adaptive testing (CAT) systems,¹³,¹⁴
perhaps revolutionizing the manner in which assessments are
administered and scored in clinical practice. However, many
From the Research and Training Center on Measuring Rehabilitation Outcomes,
Center for Rehabilitation Effectiveness, Sargent College of Health and Rehabilitation
Sciences, Boston University, Boston, MA (Haley, Coster, Andres, Ni); and
QualityMetric Inc, Lincoln, RI (Kosinski).
Supported in part by the National Institute on Disability and Rehabilitation
Research (grant no. H133B990005), the National Institute of Child Health and Human
Development (grant no. R01 HD43568), and the Agency for Healthcare Research and
Quality. The contents of this article are solely the responsibility of the authors and do
not necessarily represent the official views of the funders.
No commercial party having a direct financial interest in the results of the research
supporting this article has or will confer a benefit upon the author(s) or upon any
organization with which the author(s) is/are associated.
Reprint requests to Stephen M. Haley, PhD, PT, Research and Training Center on
Measuring Rehabilitation Outcomes, Center for Rehabilitation Effectiveness, Sargent
College of Health and Rehabilitation Sciences, Boston University, 635 Commonwealth
Ave, Boston, MA 02215, e-mail: smhaley@bu.edu.
0003-9993/04/8504-8077$30.00/0
doi:10.1016/j.apmr.2003.08.097
Arch Phys Med Rehabil Vol 85, April 2004