DOI: 10.1007/s10910-005-4525-6
Journal of Mathematical Chemistry Vol. 38, No. 1, July 2005 (© 2005)
An information-theoretic characterization of partitioned
property spaces
Gerald M. Maggiora∗
College of Pharmacy and BIO5 Institute, University of Arizona, Tucson, AZ 85721, USA
E-mail: maggiora@pharmacy.arizona.edu
Veerabahu Shanmugasundaram
Computer-Assisted Drug Discovery, Pfizer Global Research & Development, Ann Arbor,
MI 48105, USA
Received 29 December 2004; revised 20 January 2005
A methodology, derived by analogy to Shannon’s information-theoretic theory of
communication and utilizing the concept of mutual information, has been developed to
characterize partitioned property spaces. A family of non-intersecting subsets that cover
the “universe” of objects represents a partitioned property space. Each subset is thus an
equivalence class. A partition and its associated equivalence classes can be generated
using any one of a number of procedures including hierarchical and non-hierarchical
clustering, direct approaches using rough set methods, and cell-based partitioning, to
name a few. Thus, partitioned property spaces arise in many instances and represent a
very large class of problems. The approach is based on set-valued mappings from
equivalence classes in one partition to those in another and provides a coarse-grained
means for comparing property spaces. From these mappings it is possible to compute a
number of Shannon entropies that afford calculation of mutual information, which
represents the amount of information shared by two partitions of a set of objects. Taking
the ratio of the mutual information to the maximum possible mutual information
yields a quantity that measures the similarity of the two partitions. While the focus in
this work is directed towards small sets of objects, the approach can be extended to
many more classes of problems that can be put into a similar form, including
many types of cheminformatic and biological problems. A number of scenarios are
presented that illustrate the concept and indicate the broader class of problems that can be
handled by this method.
KEY WORDS: information theory, mutual information, partitions, property spaces,
Shannon entropy
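The quantities named in the abstract can be illustrated with a minimal sketch. It assumes that each partition is given as a list of class labels, one per object, and that the "maximum possible mutual information" is the smaller of the two partition entropies, which is the standard upper bound on mutual information; the normalization actually used in the paper may differ. The function names entropy, mutual_information and partition_similarity are hypothetical and chosen only for this illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a partition given as one class label per object."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) for two partitions of the same objects."""
    joint = list(zip(a, b))  # each object's pair of class labels
    return entropy(a) + entropy(b) - entropy(joint)


def partition_similarity(a, b):
    """Mutual information divided by an assumed maximum, min(H(A), H(B)).

    Assumption: the normalizer is the standard upper bound on I(A;B).
    Convention chosen here: return 0.0 if either partition has a single class.
    """
    h_max = min(entropy(a), entropy(b))
    if h_max == 0:
        return 0.0
    return mutual_information(a, b) / h_max


# Six objects partitioned two ways (illustrative data, not from the paper).
part_a = [0, 0, 1, 1, 2, 2]
same_classes = ["x", "x", "y", "y", "z", "z"]  # same classes, different names
unrelated = [0, 1, 0, 1, 0, 1]                 # cuts across every class of part_a

print(partition_similarity(part_a, same_classes))
print(partition_similarity(part_a, unrelated))
```

With this normalizer, two partitions that induce the same equivalence classes score 1 regardless of how the classes are named, while a partition whose classes are statistically independent of the other's scores 0.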
∗ Corresponding author.
0259-9791/05/0700-0001/0 © 2005 Springer Science+Business Media, Inc.