Data Mining of Supersecondary Structure Homology between Light
Chains of Immunoglobulins and MHC Molecules: Absence of the
Common Conformational Fragment in the Human IgM Rheumatoid
Factor
Hiroshi Izumi,*,† Akihiro Wakisaka,† Laurence A. Nafie,‡,§ and Rina K. Dukor§
† National Institute of Advanced Industrial Science and Technology (AIST), AIST Tsukuba West, 16-1 Onogawa, Tsukuba, Ibaraki 305-8569, Japan
‡ Department of Chemistry, Syracuse University, Syracuse, New York 13244-4100, United States
§ BioTools, Inc., 17546 SR 710 (Bee Line Hwy) Jupiter, Florida 33458, United States
Supporting Information
ABSTRACT: It is shown that fuzzy search and data mining
techniques of supersecondary structure homology for subunits
of proteins using conformational code patterns of α-helix-type
(3β5α4β) and β-sheet-type (6α4β4β) fragments can be used
to extract correlations between fragments of MHC class I
molecules and the light chain of immunoglobulins. Compared with
the DSSP method, the new method of conformational pattern
analysis with fuzzy search of structural code homology reflects
the shape of the main chain well, rather than the secondary
structure. Further, the data mining technique using the
combination of h- and s-fragment patterns can quantify the supersecondary structure homology between any subunits of
proteins with different amino acid sequences. Characteristic fragment patterns (string “shhshss”), which were sandwiched
between two identical amino acid sequences His and Pro, were found in light chains of various types of immunoglobulins, α-chain
and β-2 microglobulin of MHC class I and α-chain and β-chain of MHC class II, but not in heavy chains of Fab immunoglobulin
fragments and T cell receptors (TCR). Leukocyte immunoglobulin-like receptors (LILR) are related by the conformational
fragment (string “shhshss”) to β-2 microglobulins as a type of pair forms (string “sohsss”). Further, human IgM rheumatoid
factor, one of the immunoglobulins, did not strongly exhibit the conformational fragment pattern. Nonclassic MHC class I
molecules CD1D, MIC-A, and MIC-B, which function to activate NKT, NK, and T cells, also did not clearly show the
patterns. These code-driven mining techniques can be utilized as a metadata-generating tool for systems biology to elucidate the
biological function of such conformational fragments of MHC I and II molecules, which come in contact with various signal
ligands on the surface of T cells and natural killer cells.
■ INTRODUCTION
Major histocompatibility complex (MHC) classes I [1,2] and II [3]
molecules are the key proteins for organism self-recognition
and have polymorphisms to defend against a great diversity of
microbes. For example, natural killer (NK) cells can recognize
and kill tumor cells lacking “self” markers, such as MHC class I,
but the basis for this recognition is not completely understood [2].
Several common autoimmune diseases such as rheumatoid
arthritis are deeply related to MHC class II and other immune
modulators [3].
The polymorphisms of amino acid sequences and molecular
structures for MHC molecules and immunoglobulins are
confusing and make sequence-based analysis of structural
homology and change very difficult. Further,
no effective method for comparing the supersecondary structure
homology of many proteins currently exists. Therefore, we have
developed data mining techniques based on backbone
conformations to analyze the supersecondary structure
homology of proteins with different amino acid sequences.
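As a rough illustration of what a fuzzy search over fragment-code strings could look like, the following minimal Python sketch slides a window along a string of h, s, and o symbols and reports windows that differ from the pattern “shhshss” in at most a chosen number of positions. The Hamming-distance criterion, the function names, and the example chain string are assumptions made for illustration only; they are not the matching procedure or data used in this work.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def fuzzy_find(codes: str, pattern: str, max_mismatch: int = 1) -> List[Tuple[int, str, int]]:
    """Return (start, window, mismatches) for every window of `codes`
    that differs from `pattern` in at most `max_mismatch` positions.

    Assumption: mismatches are scored by Hamming distance, a simple
    stand-in for a fuzzy search criterion.
    """
    n = len(pattern)
    hits = []
    for start in range(len(codes) - n + 1):
        window = codes[start:start + n]
        distance = hamming(window, pattern)
        if distance <= max_mismatch:
            hits.append((start, window, distance))
    return hits


# Hypothetical fragment-code string for one subunit (not from a real structure).
chain = "ooshhshssoosohshssoo"

exact_hits = fuzzy_find(chain, "shhshss", max_mismatch=0)
fuzzy_hits = fuzzy_find(chain, "shhshss", max_mismatch=1)
print(exact_hits)
print(fuzzy_hits)
```

Raising `max_mismatch` trades specificity for sensitivity: an exact search finds only the first fragment in this toy string, while allowing one mismatch also recovers a second, slightly altered fragment.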
Previously, we have proposed a conformational code for the
description of conformations of all kinds of chemical
compounds based on structural analysis using vibrational
circular dichroism (VCD) of chiral bioactive compounds [4-7].
The conformational code consists of the combination of the
codes of regional angle locations and conformational elements
(Figure 1), and the conformational elements representing the
classification of dihedral angles are substituted for the symbols
indicating the bond locations (alphabets of angle locations) [6].
For example, the conformational elements 1, 2, 3, 4, 5, and 6
correspond to the conformational terms, T (trans), G⁺ (+gauche),
G⁻ (-gauche), sp (synperiplanar), +ac (+anticlinal),
Received: September 3, 2012
Published: February 10, 2013
Article
pubs.acs.org/jcim
© 2013 American Chemical Society 584 dx.doi.org/10.1021/ci300420d | J. Chem. Inf. Model. 2013, 53, 584-591