A Self-Consistent Sonification Method to Translate Amino Acid Sequences into Musical Compositions and Application in Protein Design Using Artificial Intelligence

Chi-Hua Yu, Zhao Qin, Francisco J. Martin-Martinez, and Markus J. Buehler*

Laboratory for Atomistic and Molecular Mechanics (LAMM), Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue 1-290, Cambridge, Massachusetts 02139, United States

ABSTRACT: We report a self-consistent method to translate amino acid sequences into audible sound, use the representation in the musical space to train a neural network, and then apply it to generate protein designs using artificial intelligence (AI). The sonification method proposed here uses the normal mode vibrations of the amino acid building blocks of proteins to compute an audible representation of each of the 20 natural amino acids, which is fully defined by the overlay of its respective natural vibrations. The vibrational frequencies are transposed to the audible spectrum following the musical concept of transpositional equivalence, playing or writing music in a way that makes it sound higher or lower in pitch while retaining the relationships between tones or chords played. This transposition method ensures that the relative values of the vibrational frequencies within each amino acid and among different amino acids are retained. The characteristic frequency spectrum and sound associated with each of the amino acids represents a type of musical scale that consists of 20 tones, the amino acid scale. To create a playable instrument, each tone associated with the amino acids is assigned to a specific key on a piano roll, which allows us to map the sequence of amino acids in proteins into a musical score.
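As a rough illustration of the transposition and key-assignment ideas described so far, the following Python sketch scales a set of vibrational frequencies by one common factor, so that every ratio within and between amino acids is preserved, and then maps a sequence onto piano-roll keys. The frequency values, the 110 Hz target, the function names, and the ordering of amino acids on the keys are all assumptions made for illustration; none of them are taken from this work.

```python
from typing import Dict, List

# Hypothetical placeholder mode frequencies in Hz (NOT values from the paper).
EXAMPLE_MODES: Dict[str, List[float]] = {
    "G": [3.0e12, 4.5e12, 9.0e12],
    "A": [2.0e12, 5.0e12, 8.0e12],
}

# One-letter codes of the 20 natural amino acids (alphabetical order is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def transpose_to_audible(
    modes: Dict[str, List[float]], lowest_audible_hz: float = 110.0
) -> Dict[str, List[float]]:
    """Multiply every frequency by the same factor.

    A single global factor shifts all tones into the audible range while
    keeping the ratios within each amino acid and among amino acids.
    The 110 Hz target for the lowest tone is an illustrative choice.
    """
    lowest = min(f for freqs in modes.values() for f in freqs)
    factor = lowest_audible_hz / lowest
    return {aa: [f * factor for f in freqs] for aa, freqs in modes.items()}


def sequence_to_keys(sequence: str, first_key: int = 48) -> List[int]:
    """Map each residue to one piano-roll key, given as a MIDI note number.

    Assigning 20 consecutive keys starting at `first_key` is hypothetical.
    """
    key_of = {aa: first_key + i for i, aa in enumerate(AMINO_ACIDS)}
    return [key_of[aa] for aa in sequence.upper()]


if __name__ == "__main__":
    audible = transpose_to_audible(EXAMPLE_MODES)
    for aa, freqs in audible.items():
        print(aa, [round(f, 1) for f in freqs])
    print(sequence_to_keys("GAGA"))
```

Using one multiplicative factor for all amino acids, rather than a separate factor per amino acid, is what keeps the relative pitch of different amino acids intact in this sketch.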
To reflect higher-order structural details of proteins, the volume and duration of the notes associated with each amino acid are defined by the secondary structure of proteins, computed using DSSP and thereby introducing musical rhythm. We then train a recurrent neural network based on a large set of musical scores generated by this sonification method and use AI to generate musical compositions, capturing the innate relationships between amino acid sequence and protein structure. We then translate the de novo musical data generated by AI into protein sequences, thereby obtaining de novo protein designs that feature specific design characteristics. We illustrate the approach in several examples that reflect the sonification of protein sequences, including multihour audible representations of natural proteins and protein-based musical compositions solely generated by AI. The approach proposed here may provide an avenue for understanding sequence patterns, variations, and mutations and offers an outreach mechanism to explain the significance of protein sequences. The method may also offer insight into protein folding and understanding the context of the amino acid sequence in defining the secondary and higher-order folded structure of proteins and could hence be used to detect the effects of mutations through sound.

KEYWORDS: protein, structural analysis, sonification, artificial intelligence, recurrent neural networks, molecular mechanics

Materials and music have been intimately connected throughout centuries of human evolution and civilization. 1-4 Indeed, materials such as wood, animal skin, or metals are the basis for most musical instruments used throughout history. 5,6 Today, we are able to use advanced computing algorithms to blur the boundary between material and sound and use hierarchical representations of materials in distinct spaces such as sound or language to advance design objectives. 
2,7-9 The approach proposed here is that the translation of protein material representations into
Received: March 20, 2019. Accepted: June 5, 2019. ACS Nano, DOI: 10.1021/acsnano.9b02180.