Journal of General Virology (2001), 82, 2463–2474. Printed in Great Britain

Immune and artificial selection in the haemagglutinin (H) glycoprotein of measles virus

Christopher H. Woelk,1 Li Jin,2 Edward C. Holmes1 and David W. G. Brown2

1 Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3PS, UK
2 Enteric, Respiratory and Neurological Virus Laboratory, Central Public Health Laboratory, London NW9 5HT, UK

We present a maximum likelihood (ML) analysis of the selection pressures that have shaped the evolution of the large (L) protein and the haemagglutinin (H) glycoprotein of measles virus (MV). A number of amino acid sites that have potentially been subject to adaptive evolution were identified in the H protein using sequences from every known genotype of MV. All but one of these putative positively selected sites reside within the ectodomain of the H protein, where they often show an association with positions of potential B-cell epitopes and sites known to interact with the CD46 receptor. This suggests that MV may be under pressure from the immune system, albeit relatively weakly, to alter sites within epitopes and hence evade the humoral immune response. The positive selection identified at amino acid 546 was shown to correlate with the passage history of MV isolates in Vero cells. We reveal that Vero cell passaging has the potential to introduce an artificial signal of adaptive evolution through selection for changes that increase affinity for the CD46 receptor.

Introduction

Measles virus (MV) is an enveloped virus (genus Morbillivirus, family Paramyxoviridae) containing a negative-sense (−) RNA genome of 15894 nucleotides.
MV is one of the current eradication targets established by the World Health Organization (WHO) and widespread vaccination programmes have greatly reduced the incidence of measles in the Western hemisphere. Unfortunately, close to 1 million deaths a year still result from MV infection in the developing world and reintroductions into Western countries make MV a significant public health problem.

Although MV is considered to be serologically monotypic, measles infections due to the wild-type virus can be classified into several genotypes, which may have distinct geographical origins (World Health Organization, 1998). The biological significance of this diversity is not well understood but since the haemagglutinin (H) and fusion (F) surface glycoproteins induce neutralizing antibody responses, it is possible that sequence differences may reflect immunological pressure (Griffin & Bellini, 1996). A rare complication of measles infection results in subacute sclerosing panencephalitis (SSPE; 1–5 per million cases), which is similar to measles inclusion body encephalitis (MIBE) in that it develops due to a persistent infection of neural cells. SSPE is a fatal neuro-degenerative disorder whose pathogenesis remains poorly understood.

The L protein is thought to be the viral polymerase due to its low abundance, large size and localization to transcriptionally active viral cores. The centre of this protein comprises five regions of high homology, which form an ancestral polymerase fold that is conserved in the RNA-dependent RNA polymerases of other virus families (Lamb & Kolakofsky, 1996).

The H protein is thought to interact with two different receptors, CD46 and SLAM (Manchester et al., 2000; Tatsuo et al., 2000), and together with the F glycoprotein facilitates MV entry into host cells (Lamb, 1993).

Author for correspondence: Christopher Woelk. Fax +44 1865 310447. e-mail Christopher.Woelk@zoo.ox.ac.uk
0001-7729 © 2001 SGM

The H protein can be divided into three domains: a cytoplasmic domain, a transmembrane domain and a large ectodomain (Muller et al., 1993). The ectodomain consists of a β-propeller structure projected from the cell surface by two helix-rich stem regions. Six antiparallel β-sheets of four strands each form the propeller such that the fourth strand of each sheet is connected by a loop to the first strand of the next sheet. Two such loops connecting sheets 4 to 5 and 5 to 6 are thought to delineate the CD46 receptor-binding domain (Langedijk et al., 1997). Cysteine interactions allow pairing of amino acids 386 to 394 and 381 to 494, and the mature 78 kDa form of the H protein