Atomic and Electronic Structures of Molecular Crystalline Cellulose Iβ: A First-Principles Investigation

Xianghong Qian,*,† Shi-You Ding,‡ Mark R. Nimlos,‡ David K. Johnson,‡ and Michael E. Himmel‡

Rx-Innovation, Inc., Fort Collins, Colorado 80525, and National Bioenergy Center, National Renewable Energy Laboratory, Golden, Colorado 80401

Received July 29, 2005; Revised Manuscript Received October 12, 2005

ABSTRACT: A theoretical model based on the competition between hydrogen-bonding energy and strain energy was constructed to explain the size of native cellulose I. The cellodextrins in native crystalline cellulose Iα and Iβ are unusually stable compared to other polysaccharides and are not easily hydrolyzed, even though they are only nanometers in diameter. The stability of crystalline cellulose I is most likely due to its greatly enhanced hydrogen-bonding (HB) network. We carried out ab initio calculations to determine the atomic and conformational structures of native crystalline cellulose Iβ. For crystalline cellulose, we found that every hydroxyl group in the cellulose structure is hydrogen bonded as both a donor and an acceptor. This agrees well with published X-ray and neutron diffraction data. We also determined the electronic structures and the energetics for one cellodextrin chain, one to four sheets of cellodextrins in cellulose, and the bulk cellulose Iβ.

I. Introduction

Native crystalline cellulose consists of two phases, Iα and Iβ. Both are frequently found to coexist in cell wall structures together with amorphous cellulose. [1-10] Bacterial and algal celluloses are predominantly of the Iα type, whereas higher plants and tunicate celluloses are mostly of the Iβ type. Both cellulose Iα and Iβ are metastable and can only be synthesized by living organisms. [11] Thermodynamically, the cellulose Iβ form is found to be more stable than Iα.
[12-14] Cellulose Iα was converted to Iβ by annealing at around 200 °C in a number of different solvent media. [12-15] In addition, there exist several mainly synthetic crystalline cellulose allomorphs, II, III, and IV, which differ vastly from native cellulose in their atomic conformational structures. [7,16-20] Even though it is well-known that the hydrogen-bonding interaction is the main binding force that maintains these molecular crystals, the detailed hydrogen-bonding networks in native crystalline cellulose remained elusive until recently. This situation was due primarily to the coexistence of both cellulose Iα and Iβ in most plant cell walls and the small size of the microfibrils with which they are associated, typically only several nanometers in width. [4,21-25] As a result of recent synchrotron X-ray and neutron diffraction analyses of native cellulose Iα and Iβ by Nishiyama and co-workers, [7,8] the basic atomic structures and hydrogen-bonding networks for these cellulose forms are now known experimentally. Cellulose Iα was found to have a triclinic P1 structure with one cellobiose chain in each unit cell (a = 6.717 Å, b = 5.9962 Å, c = 10.400 Å, α = 118.08°, β = 114.80°, and γ = 80.37°), whereas cellulose Iβ was found to have a monoclinic P2₁ structure with two cellobiose chains in each unit cell (a = 7.784 Å, b = 8.201 Å, c = 10.380 Å, α = β = 90°, γ = 96.5°). Both crystalline cellulose Iα and Iβ are formed by stacking planar cellulose sheets, but in different ways. The cellulose sheets are in turn composed of linear cellulose chains bound together by interchain hydrogen-bonding interactions between the O6H (donor) in one chain and the O3 (acceptor) in the neighboring chain. Besides interchain hydrogen-bonding interactions, there are intrachain hydrogen bonds between O3H and the ring O5 and between O2H and O6.
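As a rough consistency check on the lattice parameters quoted above, the sketch below computes the unit-cell volume of each allomorph with the standard triclinic volume formula and converts it to an implied crystal density. It is an illustration under stated assumptions, not part of the original calculations: the helper names `cell_volume` and `density` are hypothetical, the parameters are used exactly as quoted in the text, and each chain is assumed to contribute one cellobiose repeat (C12H20O10, about 324.28 g/mol) per unit cell.

```python
import math

AVOGADRO = 6.02214076e23        # 1/mol
# Assumption: one cellobiose repeat (C12H20O10) per chain per unit cell.
M_CELLOBIOSE_REPEAT = 324.28    # g/mol

def cell_volume(a, b, c, alpha, beta, gamma):
    """Volume (cubic angstroms) of a general triclinic cell; angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg
                                 + 2.0 * ca * cb * cg)

def density(volume_a3, chains_per_cell):
    """Implied crystal density in g/cm^3 (1 cubic angstrom = 1e-24 cm^3)."""
    mass_g = chains_per_cell * M_CELLOBIOSE_REPEAT / AVOGADRO
    return mass_g / (volume_a3 * 1e-24)

# Lattice parameters as quoted in the text: (a, b, c, alpha, beta, gamma).
CELLS = {
    "I-alpha": {"params": (6.717, 5.9962, 10.400, 118.08, 114.80, 80.37),
                "chains": 1},
    "I-beta":  {"params": (7.784, 8.201, 10.380, 90.0, 90.0, 96.5),
                "chains": 2},
}

if __name__ == "__main__":
    for name, cell in CELLS.items():
        v = cell_volume(*cell["params"])
        rho = density(v, cell["chains"])
        print(f"{name}: V = {v:.1f} A^3, density = {rho:.3f} g/cm^3")
```

Under these assumptions the Iβ cell holds two chains in roughly twice the volume of the one-chain Iα cell, so the two implied densities are close, with Iβ slightly higher.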
For cellulose Iβ, the lattice a direction is the cellulose sheet stacking direction and the b direction is perpendicular to the chain direction in the sheet plane. The c direction is the chain direction, with a cellobiose repeat length of ∼10.4 Å in both Iα and Iβ. The two chains in the cellulose Iβ unit cell (Figure 1) lie on two neighboring sheets, are designated the origin and center chains, respectively, and are conformationally different. The center chain is shifted in the chain direction by c/4. In cellulose Iα, the chains lying on two neighboring sheets are also shifted with respect to each other along the c direction by c/4. However, these two chains are conformationally identical to each other. The stacking order of cellulose sheets in Iβ is ABAB..., whereas in cellulose Iα it is ABCABC.... The binding forces between the sheets were thought to be mainly van der Waals interactions for both Iα and Iβ; however, substantial intersheet C-H···O hydrogen-bonding interactions were also believed to play a role in the cohesion of these cellulose sheets. [7] The differences between Iα and Iβ are very subtle, and both have almost the same density. Cellulose Iβ has a slightly higher density than Iα. It is perhaps important to note that Iβ has a slightly compressed repeat length in the c (chain) direction relative to Iα (10.38 Å in Iβ vs 10.40 Å in Iα). [7,8] This indicates that one of the native cellulose allomorphs is more stressed than the other, which could affect the stability of the corresponding structures. A recent investigation by Ding and Himmel [26] suggested that the cellulose microfibril in plant cell walls is composed of 36 cellulose chains arranged in only six sheets. Of these 36 chains, only the inner chains are crystalline in nature, whereas the outer ones are noncrystalline. The crystalline part of the inner cellulose of the microfibril is dominated by I

† Rx-Innovation, Inc.
‡ National Renewable Energy Laboratory.
* Corresponding author.
Macromolecules 2005, 38, 10580-10589. DOI: 10.1021/ma051683b. CCC: $30.25. © 2005 American Chemical Society. Published on Web 11/12/2005.