© 2006 Nature Publishing Group
Ab initio determination of solid-state nanostructure
P. Juhás 1, D. M. Cherba 2, P. M. Duxbury 1, W. F. Punch 2 & S. J. L. Billinge 1
Advances in materials science and molecular biology followed rapidly from the ability to characterize atomic structure using single crystals 1–4. Structure determination is more difficult if single crystals are not available 5. Many complex inorganic materials that are of interest in nanotechnology have no periodic long-range order and so their structures cannot be solved using crystallographic methods 6. Here we demonstrate that ab initio structure solution of these nanostructured materials is feasible using diffraction data in combination with distance geometry methods. Precise, sub-ångström resolution distance data are experimentally available from the atomic pair distribution function (PDF) 6,7. Current PDF analysis consists of structure refinement from reasonable initial structure guesses 6,7 and it is not clear, a priori, that sufficient information exists in the PDF to obtain a unique structural solution. Here we present and validate two algorithms for structure reconstruction from precise unassigned interatomic distances for a range of clusters. We then apply the algorithms to find a unique, ab initio, structural solution for C60 from PDF data alone. This opens the door to sub-ångström resolution structure solution of nanomaterials, even when crystallographic methods fail.
Powerful direct imaging methods, such as scanning tunnelling microscopy, transmission electron microscopy and, more recently, lensless imaging 8, are available to characterize the structure of nanomaterials; however, they do not yield the high precision three-dimensional structural information traditionally obtained using crystallographic methods.
The effort towards high accuracy structure determination is driven by the fact that even small changes in interatomic bond lengths can have a marked effect on the properties of solid state materials. For example, the key polaron distortion in giant magnetoresistive materials is of the order of one-tenth of an ångström 9. Extended X-ray absorption fine structure analysis yields high precision values for the local environment of atoms in nanoparticles 10 but not a complete structure. Nuclear magnetic resonance (NMR) in combination with distance geometry methods is critical to structure solution of proteins 11, particularly in the absence of protein single crystals. However, nuclear Overhauser effect distances used in protein NMR analysis have low resolution, with uncertainties of the order of one ångström 12. The distance lists extracted from PDF data of nanostructured solids have high resolution, with uncertainties of the order of a few hundredths of an ångström in the atomic separations. However, despite PDFs of materials being measured for almost 75 years (ref. 7), ab initio structure solution from such data has not been previously demonstrated. Here we present and validate several algorithms for structure solution from such high precision, but unassigned, distance lists.
The PDF method was traditionally applied to the study of glasses and liquids 13 but more recently has also successfully yielded information about atomic-scale structures of nanosized materials 6,10,14,15. For example, the structure of ZnS nanoparticles was found to be significantly modified from the expected sphalerite structure that had been inferred from transmission electron microscopy observations 14. Another important area of PDF application is nanostructured materials that have nanoscale inhomogeneities within a bulk matrix 6.
Atomic arrangements in these materials are well ordered locally, but are not long-range ordered and cannot be solved using crystallographic methods. PDF data are readily obtained using neutron and X-ray powder diffraction measurements, where area X-ray detectors allow remarkably rapid data acquisition 16. Previously, analysis of PDF data has relied on known starting models 14 or good structural analogues, and has used a trial-and-error approach 6,17, which is often a laborious process. Alternative methods such as reverse Monte Carlo 18, empirical potential structure refinement 19 and experimentally constrained molecular relaxation 20 are successful on highly disordered materials and provide a pool of candidate structures consistent with the data, but have not been used to reconstruct the structures of well ordered nanomaterials.
The PDF data from a single element system contain a simple unsorted list of the atomic distances present in the cluster, without any orientational or three-body information. Reconstruction of structure from noisy or incomplete distances is computationally hard 21,22 even when assignment of lengths to atom pairs is available, as is usually the case in protein structure solution using NMR. The distances extracted from PDF data are much more precise; however, the lengths are unassigned, as the pair of atoms contributing to each distance is not known. Nevertheless, we find that a unique and efficient structure solution is possible from unassigned ideal distances for a wide range of clusters, including platonic solids, finite lattices of different symmetry, the C60 ‘buckyball’ and Lennard-Jones minimum-energy clusters 23,24. More remarkably, we found that ab initio structure determination is also possible using distances extracted from experimental neutron PDF data for fullerenes.
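As an illustration of what an unassigned distance list is, the following minimal Python sketch builds the list of all N(N − 1)/2 interatomic distances for a small cluster and checks that it is unchanged when the cluster is rotated and its atoms are relabelled. The tetrahedron coordinates and the helper names `distance_list` and `rotate_z` are hypothetical choices made for this example only; they are not taken from the work described here.

```python
import itertools
import math
import random


def distance_list(coords):
    """Return the sorted list of all N(N-1)/2 interatomic distances."""
    return sorted(math.dist(a, b) for a, b in itertools.combinations(coords, 2))


def rotate_z(coords, angle):
    """Rotate every atom about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


# Hypothetical test cluster: a regular tetrahedron (a platonic solid).
tetrahedron = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

# Rotate the cluster and shuffle the atom labels.
moved = rotate_z(tetrahedron, 0.7)
random.Random(0).shuffle(moved)

original = distance_list(tetrahedron)
transformed = distance_list(moved)

# The two lists agree to rounding error: the list carries no record of
# orientation or of which atom pair produced which distance.
same = all(math.isclose(p, q) for p, q in zip(original, transformed))
print(len(original), same)
```

The list is sorted here only to make the comparison easy; the information content is the multiset of distances, which is what makes the reconstruction an assignment problem.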
The n-atom Lennard-Jones (LJ-n) cluster is the ground-state configuration of n atoms assuming a Lennard-Jones pair potential acting between all the atoms, and is a standard benchmark system for new optimization methods 23–25. We have used the interatomic distances occurring in these structures as the target distances for testing various distance geometry algorithms. The cost function that we optimize is the variance between the model distances and the target distances, namely $\mathrm{var}(d) = \frac{1}{N_p} \sum_{k=1}^{N_p} \left( d^{m}_{k} - d^{e}_{l(k)} \right)^2$, where $N_p = N(N-1)/2$ is the number of atom pairs in the cluster, $d_k$ is the interatomic distance of atom pair $k$, while the suffix $m$ indicates the model and the suffix $e$ indicates the experimental or target value. When $\mathrm{var}(d) = 0$, the fit is exact. The most difficult computational aspect of this problem is correctly assigning the distances between model atom pairs $k$ to target distances $l(k)$.
We first tried a simulated annealing approach 26, which was successful in finding the correct small clusters from unassigned distance data. However, this method failed for anything more complicated than a 20-atom cluster. This is presumably due to the rugged topology of the potential (var(d)) surface. Genetic or evolutionary algorithms have been very successful in finding the ground state of many types of clusters using theoretical interatomic potentials 23,25,27. Based on these papers, we have developed
LETTERS
1 Department of Physics and Astronomy, 2 Department of Computer Science and Engineering, Michigan State University, East Lansing, Michigan 48824, USA.
Vol 440 | 30 March 2006 | doi:10.1038/nature04556 | 655
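The variance cost function var(d) defined above can be sketched in a few lines of Python. This is a minimal illustration under a stated assumption, not the implementation used in this work: it assumes that the assignment l(k) is one-to-one between model and target distances, in which case pairing the two sorted lists minimizes the sum of squared differences. The function names `distance_list` and `distance_variance` and the planar test cluster are hypothetical.

```python
import itertools
import math


def distance_list(coords):
    """Return the sorted list of all N(N-1)/2 interatomic distances."""
    return sorted(math.dist(a, b) for a, b in itertools.combinations(coords, 2))


def distance_variance(model_coords, target_distances):
    """Evaluate var(d) for a full-size model cluster.

    Assumption: l(k) is a one-to-one assignment, so matching the sorted
    model distances to the sorted target distances is the optimal choice.
    """
    model = distance_list(model_coords)
    target = sorted(target_distances)
    if len(model) != len(target):
        raise ValueError("expected N(N-1)/2 target distances")
    return sum((m - t) ** 2 for m, t in zip(model, target)) / len(model)


# Hypothetical target: the six distances of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
target = distance_list(square)

# The correct structure gives an exact fit; a distorted one does not.
exact = distance_variance(square, target)
distorted = distance_variance(
    [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.2)], target
)
print(exact, distorted)
```

Sorting handles the assignment only when the model has the same number of atoms as the target cluster; for a partial model, choosing which target distances to use is a harder problem that this sketch does not address.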