CREATING INVARIANT SUBSCALES OF THE GHQ-30

JOYCE WHITTINGTON* and FELICIA A. HUPPERT

Department of Psychiatry, University of Cambridge, Box 189, Addenbrooke's Hospital, Cambridge CB2 2QQ, U.K.

Abstract

Background: The GHQ-30 clearly contains more information than the single derived score. Attempts to tap this information using factor analyses of the scale items have largely been abandoned because the factors extracted depend on the population sampled.

Method: It is first shown that factor analysis of the GHQ-30 for a given population at a given time is remarkably stable across subsets of the population, but not for the same population at different times. Different psychometric considerations are then invoked to define four subscales which are independent of the particular population and time of measurement. These subscales correspond to the four combinations of positive and negative responses to positively- and negatively-worded questions.

Results: It is shown that these four subscales have very different characteristics in the population as a whole but that each has a stable distribution over time. Subscale profiles show qualitative differences between different age and sex groups and between different groups at high risk of psychiatric disorder. They also have different strengths of relationships with Neuroticism and with mortality.

Conclusions: The four subscales provide more information than the single derived score. The GHQ-30 embodies a measure of positive mental well-being which is completely overlooked by conventional scoring and usage; this measure is worthy of further investigation. Our findings have implications for the development, use and interpretation of subscales derived from questionnaires, such as the GHQ-30, which measure changing states, rather than stable traits, in the individual.

© 1998 Elsevier Science Ltd.
All rights reserved.

Key words: GHQ-30, psychiatric symptoms, factor analysis, positive wellbeing

INTRODUCTION

The General Health Questionnaire (GHQ) of Goldberg (1972) is probably the most popular instrument for screening psychiatric disorder in patient and community samples. There are four versions of the GHQ in general use: the long 60-item version (GHQ-60) and three versions which are derived from the GHQ-60 by omission of items. The GHQ-30 contains fewer questions about somatic symptoms, the GHQ-28 is based on a factor analysis of the GHQ-60, and the GHQ-12 is a shorter version of the GHQ-30. The 30-item version, GHQ-30, is the topic of this paper.

Each item asks about the recent experience of a particular symptom. For the GHQ-30, half of the items are presented positively (agreement indicates absence of symptom), e.g. "Have you recently been getting out of the house as much as usual?", and half are presented negatively (agreement indicates presence of symptom), e.g. "Have you recently felt that things were getting on top of you?". Each item allows four response categories which signify: strong agreement, weak agreement, weak disagreement, strong disagreement. There are three generally accepted methods of scoring each item: the traditional method scores strong or weak agreement with negatively presented items and strong or weak disagreement with positively presented items (symptom present) as "1" and all other responses (symptom absent) as "0"; the Likert method scores responses from strong "symptom absent" to strong "symptom present" as 0, 1, 2, 3; and the CGHQ method is similar to the traditional method except for negatively worded items, where "same as usual" (weak symptom absent) responses are scored 1, in order to detect more chronic symptoms. If the GHQ-30 is used to screen for psychiatric "caseness", with thresholds of 4/5, 39/40 and 12/13 respectively for the three scoring methods, there is little to choose between them.
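As an illustration only (it is not part of the original paper), the three item-scoring rules and the caseness thresholds described above might be sketched in Python as follows. The numeric coding of the four response categories, the names `Response`, `severity`, `traditional_score`, `likert_score`, `cghq_score` and `is_case`, and the reading of a threshold such as 4/5 as "case if the score is 5 or more" are all assumptions made for this sketch.

```python
from dataclasses import dataclass

# Assumed coding of the four response categories named in the text.
STRONG_AGREE, WEAK_AGREE, WEAK_DISAGREE, STRONG_DISAGREE = range(4)


@dataclass(frozen=True)
class Response:
    positively_worded: bool  # True: agreement indicates absence of symptom
    answer: int              # one of the four constants above


def severity(r: Response) -> int:
    """Map a response to 0 (strong symptom absent) .. 3 (strong symptom present)."""
    # Agreeing with a positively worded item means the symptom is absent;
    # agreeing with a negatively worded item means it is present.
    return r.answer if r.positively_worded else 3 - r.answer


def traditional_score(responses) -> int:
    """Traditional method: 1 for any 'symptom present' response, else 0."""
    return sum(1 for r in responses if severity(r) >= 2)


def likert_score(responses) -> int:
    """Likert method: 0, 1, 2, 3 from strong absent to strong present."""
    return sum(severity(r) for r in responses)


def cghq_score(responses) -> int:
    """CGHQ method: as traditional, but 'weak symptom absent' on a
    negatively worded item also scores 1."""
    total = 0
    for r in responses:
        s = severity(r)
        if s >= 2 or (s == 1 and not r.positively_worded):
            total += 1
    return total


# Assumed reading of the thresholds 4/5, 39/40 and 12/13:
# a respondent is a "case" when the score reaches the upper value.
CASE_THRESHOLD = {"traditional": 5, "likert": 40, "cghq": 13}


def is_case(score: int, method: str) -> bool:
    return score >= CASE_THRESHOLD[method]


# Four hypothetical responses (a real GHQ-30 record would have 30).
sample = [
    Response(positively_worded=True, answer=WEAK_AGREE),        # severity 1
    Response(positively_worded=True, answer=STRONG_DISAGREE),   # severity 3
    Response(positively_worded=False, answer=WEAK_DISAGREE),    # severity 1
    Response(positively_worded=False, answer=STRONG_AGREE),     # severity 3
]
print(traditional_score(sample), likert_score(sample), cghq_score(sample))
```

For the four hypothetical responses, only the third item separates the traditional and CGHQ methods: a weak "symptom absent" answer to a negatively worded item scores 0 under the traditional rule and 1 under the CGHQ rule.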
If the score is instead to be used as a dimensional measure, then the CGHQ is to be preferred for its more normal distribution (Goodchild and Duncan-Jones, 1985; Goldberg and Williams, 1988).

Soc. Sci. Med. Vol. 46, No. 11, pp. 1429-1440, 1998. PII: S0277-9536(97)10133-2
*Author for correspondence.

The widespread use of this questionnaire has led investigators to enquire whether research might be advanced by using methods for analysing raw scores that retain more information than a single overall score. To this end, several researchers have extracted subscales by factor analysis of the 30 items, and thereby obtained information not available in the full-scale scores alone. For example, factor analyses of English and Chinese versions of the questionnaire, administered to English-speaking Chinese, yielded two congruent factors and three factors interpreted as representing the same concepts in the different language versions (Chan,