Acta Cryst. (1990). A46, 560-567
Direct Methods with Single Isomorphous Replacement Data. I.
Reduction of Systematic Errors
BY W. FUREY JR,* K. CHANDRASEKHAR, F. DYDA† AND M. SAX
Biocrystallography Laboratory, PO Box 12055, Veterans Administration Medical Center, University Drive C,
Pittsburgh, PA 15240, USA and Department of Crystallography, 304 Thaw Hall, University of Pittsburgh,
Pittsburgh, PA 15260, USA
(Received 20 March 1989; accepted 28 February 1990)
Abstract
The direct-methods procedure for single isomorphous
replacement (SIR) data [Hauptman (1982). Acta
Cryst. A38, 289-294], as modified by Fortier, Moore
& Fraser [Acta Cryst. (1985). A41, 571-577], has been
implemented and tested with a large number of
known structures. It was found that the modified
procedure greatly reduces the bias toward 'unre-
solved' SIR invariant values associated with estimates
of 0 or π, but does not remove it entirely. If the heavy
atoms are not in a centrosymmetric array the centroid
of the distribution of invariant estimates is not cen-
tered on true protein values, but is biased toward
conventional SIR values by up to 15°; thus errors in
the estimates are not random but systematic. When
the heavy atoms are in a centrosymmetric array (or a
single heavy-atom site in space group P2₁), the distribution
of estimates is often sharply bimodal, with
peaks centered at both true invariant values and pure
'unresolved' SIR values. Simple procedures are given
which can be applied in both situations to reduce
significantly the bias with no overall loss of accuracy.
An additional correction factor is then described
which can be used to remove nearly all of the bias,
and improve the accuracy as well. The result is that
errors in the corrected invariant estimates are small
in magnitude, but are now also random instead of
systematic. Since the number of estimates greatly
exceeds the number of phases, the remaining random
errors should have little impact in phasing processes.
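The distinction drawn above between systematic and random errors can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the procedure of the paper: the errors are modelled as Gaussian, and the 30° spread, the sample size of 10000 and the function name `simulate_estimate_errors` are all hypothetical choices. Only the 15° bias figure comes from the text.

```python
import random
import statistics


def simulate_estimate_errors(n_estimates, bias_deg, sigma_deg, seed=0):
    """Return simulated errors (in degrees) of invariant estimates.

    Assumption: each error is a constant bias plus Gaussian noise.
    The true error distribution is not specified here.
    """
    rng = random.Random(seed)
    return [bias_deg + rng.gauss(0.0, sigma_deg) for _ in range(n_estimates)]


# Purely random errors: the mean error shrinks as estimates accumulate.
random_only = simulate_estimate_errors(10000, bias_deg=0.0, sigma_deg=30.0)

# Systematic errors: the same noise plus a 15 degree bias.
# The mean error stays near the bias however many estimates are used.
biased = simulate_estimate_errors(10000, bias_deg=15.0, sigma_deg=30.0)

print("mean error, random only:", statistics.mean(random_only))
print("mean error, with bias:  ", statistics.mean(biased))
```

In this simplified picture, redundancy averages away the random component but leaves the systematic component untouched, which is why removing the bias matters more than reducing the scatter.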
Introduction
In recent years, theoretical developments in the area
of direct methods as applied to protein crystallogra-
phy have advanced considerably. In particular, a
theory for the integration of direct methods with
single isomorphous replacement (Hauptman, 1982)
looked very promising in that it was possible accurately
to identify large numbers of three-phase structure
invariants with values of 0 or π, even for very
* To whom correspondence should be addressed.
† In partial fulfilment of the Doctor of Philosophy Degree.
large structures. Other procedures capable of identifying
invariants with values of 0 or π from
single-isomorphous-replacement data were also developed
(Karle, 1983; Giacovazzo, Cascarano & Zheng, 1988).
Unfortunately, it was shown (Xu, Yang, Furey, Sax,
Rose & Wang, 1984) that invariant values of 0 or π
are not particularly useful for protein crystallography
since they generally correspond to the heavy-atom
invariants (or heavy-atom invariants plus π) of the
included derivative. Any procedure which forces
individual phases to satisfy such invariants therefore
results in producing classical 'unresolved' SIR (single
isomorphous replacement) phases, since the
invariants themselves are actually SIR invariants (e.g.
invariants produced by summing over three SIR
phases). The realization of the correspondence with
SIR phases prompted a re-examination and
modification of Hauptman's formulation (Fortier,
Moore & Fraser, 1985) resulting in a new procedure
which should be considerably more powerful. With
this modification it is possible accurately to identify
large numbers of invariants with absolute values anywhere
in the range 0-π; however, only the magnitude
of the angle can be identified (i.e. the cosine of the invariant).
By moving away from 0 and π values the bias toward
SIR invariants should be diminished and the resulting
estimates should become more useful for the determi-
nation of individual protein phases.
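As a minimal sketch of why a cosine estimate fixes only the magnitude of an invariant, the following computes a three-phase structure invariant as the sum of the phases of reflections h, k and -h-k, and compares it with the same triple with all phases negated. The function names and the numerical phase values are hypothetical and chosen only for illustration. They are not taken from the structures tested in this study.

```python
import math


def wrap(angle):
    """Wrap an angle in radians into the interval (-pi, pi]."""
    wrapped = math.fmod(angle, 2.0 * math.pi)
    if wrapped > math.pi:
        wrapped -= 2.0 * math.pi
    elif wrapped <= -math.pi:
        wrapped += 2.0 * math.pi
    return wrapped


def three_phase_invariant(phi_h, phi_k, phi_minus_h_minus_k):
    """Sum of the phases of reflections h, k and -h-k (radians)."""
    return wrap(phi_h + phi_k + phi_minus_h_minus_k)


def magnitude_from_cosine(cos_value):
    """Recover |invariant| in [0, pi] from its cosine; the sign is lost."""
    clipped = max(-1.0, min(1.0, cos_value))  # guard against rounding
    return math.acos(clipped)


# Hypothetical phases (degrees) for one triple of reflections.
phases = [math.radians(40.0), math.radians(75.0), math.radians(-20.0)]
invariant = three_phase_invariant(*phases)

# The same triple with every phase negated has the opposite invariant.
mirror = three_phase_invariant(*[-p for p in phases])

print("invariant (deg):", math.degrees(invariant))
print("negated   (deg):", math.degrees(mirror))
print("|invariant| from cosine (deg):",
      math.degrees(magnitude_from_cosine(math.cos(invariant))))
print("|negated|   from cosine (deg):",
      math.degrees(magnitude_from_cosine(math.cos(mirror))))
```

Both triples share the same cosine, so an estimate of the cosine alone cannot distinguish between them. The sign has to come from elsewhere in the phasing process.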
In all previous studies the proposed methods were
tested with error-free data, usually for a single struc-
ture; thus the general applicability has not been
demonstrated. In the current study we have applied
the modified formulation of Fortier, Moore & Fraser
to numerous structures taken from the Protein Data
Bank (Bernstein et al., 1977) to determine whether
the accuracy of the estimates is sensitive to space
group, structure size and heavy-atom substitution
parameters. It was found that although the Fortier
modification greatly reduces the bias towards SIR
invariants, it does not remove it entirely since a
residual bias of up to 15° remains. Several alternative
modifications to the procedure are now reported, all
of which lead to further reductions in the bias towards
SIR, and one of which can significantly improve the
accuracy of the estimates as well. With the new
© 1990 International Union of Crystallography