A Self-Consistent Sonification Method to
Translate Amino Acid Sequences into Musical
Compositions and Application in Protein
Design Using Artificial Intelligence
Chi-Hua Yu, Zhao Qin, Francisco J. Martin-Martinez, and Markus J. Buehler*
Laboratory for Atomistic and Molecular Mechanics (LAMM), Department of Civil and Environmental Engineering, Massachusetts
Institute of Technology, 77 Massachusetts Avenue 1-290, Cambridge, Massachusetts 02139, United States
Supporting Information
ABSTRACT: We report a self-consistent method to translate
amino acid sequences into audible sound, use the representation
in the musical space to train a neural network, and then
apply it to generate protein designs using artificial intelligence
(AI). The sonification method proposed here uses the normal
mode vibrations of the amino acid building blocks of proteins
to compute an audible representation of each of the 20 natural
amino acids, which is fully defined by the overlay of its
respective natural vibrations. The vibrational frequencies are
transposed to the audible spectrum following the musical
concept of transpositional equivalence, playing or writing
music in a way that makes it sound higher or lower in pitch
while retaining the relationships between tones or chords
played. This transposition method ensures that the relative
values of the vibrational frequencies within each amino acid and among different amino acids are retained. The
characteristic frequency spectrum and sound associated with each of the amino acids represent a type of musical scale
that consists of 20 tones, the “amino acid scale”. To create a playable instrument, each tone associated with the amino
acids is assigned to a specific key on a piano roll, which allows us to map the sequence of amino acids in proteins into a
musical score. To reflect higher-order structural details of proteins, the volume and duration of the notes associated with
each amino acid are defined by the secondary structure of proteins, computed using DSSP, thereby introducing
musical rhythm. We then train a recurrent neural network based on a large set of musical scores generated by this
sonification method and use AI to generate musical compositions, capturing the innate relationships between amino acid
sequence and protein structure. We then translate the de novo musical data generated by AI into protein sequences,
thereby obtaining de novo protein designs that feature specific design characteristics. We illustrate the approach in several
examples that reflect the sonification of protein sequences, including multihour audible representations of natural
proteins and protein-based musical compositions solely generated by AI. The approach proposed here may provide an
avenue for understanding sequence patterns, variations, and mutations and offers an outreach mechanism to explain the
significance of protein sequences. The method may also offer insight into protein folding and understanding the context
of the amino acid sequence in defining the secondary and higher-order folded structure of proteins and could hence be
used to detect the effects of mutations through sound.
KEYWORDS: protein, structural analysis, sonification, artificial intelligence, recurrent neural networks, molecular mechanics
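The transposition and score-mapping steps summarized in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the frequency values, the target pitch, the piano-key assignment (`KEY_OF`), and the duration and volume values per DSSP class (`NOTE_STYLE`) are all hypothetical placeholders. The only property taken from the text is that one common factor is applied to every frequency, so that ratios within and among amino acids are retained, and that secondary structure sets note duration and volume.

```python
from typing import Dict, List, Tuple

# Placeholder vibrational spectra (Hz) for two amino acids; NOT values from the paper.
SPECTRA_HZ: Dict[str, List[float]] = {
    "G": [1.0e12, 2.5e12, 4.0e12],
    "A": [1.2e12, 3.0e12, 6.0e12],
}

# Hypothetical piano-key (MIDI number) assignment: one key per amino acid.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
KEY_OF: Dict[str, int] = {aa: 48 + i for i, aa in enumerate(AMINO_ACIDS)}

# Hypothetical (duration in beats, volume 0-127) per DSSP code.
NOTE_STYLE: Dict[str, Tuple[float, int]] = {
    "H": (1.0, 100),   # helix
    "E": (0.5, 80),    # strand
    "-": (0.25, 60),   # everything else (assumed default)
}


def common_factor(spectra: Dict[str, List[float]], target_lowest_hz: float) -> float:
    """One scale factor for ALL amino acids, chosen so that the lowest
    frequency in the whole set lands on target_lowest_hz."""
    lowest = min(min(freqs) for freqs in spectra.values())
    return target_lowest_hz / lowest


def transpose(spectra: Dict[str, List[float]], factor: float) -> Dict[str, List[float]]:
    """Multiply every frequency by the same factor; all ratios are preserved."""
    return {aa: [f * factor for f in freqs] for aa, freqs in spectra.items()}


def sequence_to_score(sequence: str, dssp: str) -> List[Tuple[int, float, int]]:
    """Map residues to (key, duration, volume) notes using per-residue DSSP codes."""
    if len(sequence) != len(dssp):
        raise ValueError("sequence and DSSP string must have the same length")
    score = []
    for aa, ss in zip(sequence, dssp):
        duration, volume = NOTE_STYLE.get(ss, NOTE_STYLE["-"])
        score.append((KEY_OF[aa], duration, volume))
    return score


factor = common_factor(SPECTRA_HZ, target_lowest_hz=110.0)
audible = transpose(SPECTRA_HZ, factor)
score = sequence_to_score("GA", "HE")
```

Because a single multiplicative factor is used, the interval between any two tones (their frequency ratio) is unchanged by the transposition, which is the sense in which the audible spectra remain self-consistent with the molecular ones.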
Materials and music have been intimately connected
throughout centuries of human evolution and
civilization.¹⁻⁴ Indeed, materials such as wood,
animal skin, or metals are the basis for most musical
instruments used throughout history.⁵,⁶ Today, we are able
to use advanced computing algorithms to blur the boundary
between material and sound and use hierarchical representations
of materials in distinct spaces such as sound or language
to advance design objectives.²,⁷⁻⁹ The approach proposed here
is that the translation of protein material representations into
Received: March 20, 2019
Accepted: June 5, 2019
Article
www.acsnano.org
Cite This: ACS Nano XXXX, XXX, XXX-XXX
© XXXX American Chemical Society A DOI: 10.1021/acsnano.9b02180