360 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 10, NO. 3, JUNE 2002
A Fuzzy MHT Algorithm Applied to Text-Based
Information Tracking
Santiago Aja-Fernández, Carlos Alberola-López, Member, IEEE, and George V. Cybenko
Abstract—In this paper, we carry out a detailed analysis of
a fuzzy version of Reid’s classical multiple hypothesis tracking
(MHT) algorithm. Our fuzzy version is based on well-known
fuzzy feedback systems, but the fact that the system we describe
is specialized for likelihood discrimination makes this study particularly
novel. We discuss several techniques for rule activation.
One of them, namely, the sum–product, seems particularly useful
for likelihood management and its linearity makes it tractable for
further analysis. Our analysis is performed in two stages. First,
we demonstrate that, with appropriately chosen rules, our system
can discriminate the correct hypothesis. Second, the steady-state
behavior with constant input is characterized analytically. This
enables us to establish the optimality of the sum–product method
and it also gives a simple procedure to predict the system’s
behavior as a function of the rule base. We believe this result can
be used to devise a simple procedure for fine-tuning the rule base
according to the system designer's needs. The application driving
our fuzzy MHT implementation and analysis is the tracking of
natural language text-based messages. That application is used as
an example throughout the paper.
Index Terms—Fuzzy feedback system, hypotheses discrimination,
information tracking, multiple hypothesis tracking (MHT) algorithm,
natural language processing.
I. INTRODUCTION
NATURAL language messages are present in many information
processing and analysis applications. However,
to date, most systems for natural language processing have been
used for database querying or machine translation. New and
more powerful text processing techniques need to be developed
and analyzed to handle other important applications that
require correlation of text-based messages, such as intelligence
analysis, computer security incident databases, and customer
service reporting.
These applications have several common attributes. They
involve tracking possibly ambiguous reports generated by
different observers over time; in this context, tracking means
finding which messages deal with the same pieces of information
and should therefore be correlated over time. Each such
application also tends to be narrow in scope, so a few important
keywords should be carefully searched for and processed. These
application areas all need more advanced automatic analysis
techniques, given the increasing amount of networked text-based
information available to them.
Manuscript received April 6, 2001; revised October 6, 2001 and December 5,
2001. The work of S. Aja-Fernández and C. Alberola-López was supported in
part by the Comisión Interministerial de Ciencia y Tecnología under Research
Grants TIC97-0772 and 1FD97-0881 and by Junta de Castilla y León under
Research Grants VA78/99 and VA91/01. The work of G. V. Cybenko was supported
by the National Science Foundation under Grant 9813744 and DARPA
Grant F30602-98-2-0107.
S. Aja-Fernández is with the Department of Teoría de la Señal y Comunicaciones
e Ingeniería Telemática, University of Valladolid, 47011 Valladolid,
Spain.
C. Alberola-López is with the Department of Teoría de la Señal y Comunicaciones
e Ingeniería Telemática of the University of Valladolid, Spain, ETSI
Telecomunicación, Campus Miguel Delibes, 47011 Valladolid, Spain (e-mail:
caralb@tel.uva.es).
G. V. Cybenko is with the Thayer School of Engineering, Dartmouth College,
Hanover, NH 03755 USA.
Publisher Item Identifier S 1063-6706(02)04829-4.
TEXTTRACK, described in [1], is a software system whose
goal is to apply advanced signal processing tracking concepts
to natural language processing. TEXTTRACK addresses the
problems of correlating and tracking observations of multiple
moving vehicles reported by natural language messages that
are generated by multiple observers asynchronously over time.
The system has demonstrated that such problems can be tackled
using relatively mature concepts from radar signal processing,
namely the multiple hypothesis tracking (MHT) algorithm [20].
The prototype accepts simple natural language messages about
vehicle types and locations, correlates the messages, and
associates groups of messages into the most likely tracks based on
a succession of positions. The correlation procedure is carried out
in two steps. First, an appropriately modified, but still classical,
Bayesian framework is used to handle the ambiguity in natural
language descriptions; a formal theorem shows that, under very
mild conditions, the correct solution is eventually achieved. The
second step uses a fuzzy inference engine (FIE), specifically, a
fuzzy version of Reid's classical Bayesian multiple hypothesis
tracking algorithm. Since the purpose is to model natural
language ambiguity, linguistic variables (i.e., computing with
words, in Zadeh's terminology [24]) are a natural choice.
However, [1] does not include a rigorous analytical
study of the TEXTTRACK system; that work presented an intuitive
argument for the system's effectiveness, illustrated
with several working examples.
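To make the idea of a fuzzy feedback system with sum-product rule activation more concrete, the following is a minimal sketch under stated assumptions, not the TEXTTRACK implementation. The three labels, the triangular membership functions, the 3x3 rule table, and the function names `memberships`, `step`, and `run` are all hypothetical choices made for illustration. The sketch fuzzifies the previous likelihood of a hypothesis and the current match score of a new message, weights each rule by the product of the two membership degrees, and sums the weighted rule outputs; the result is fed back as the next likelihood.

```python
# Illustrative only: labels, membership functions and rule table are
# assumptions made for this sketch, not taken from the paper.

def memberships(x):
    """Degrees of membership of x in (LOW, MEDIUM, HIGH) on [0, 1].

    Triangular functions centered at 0, 0.5 and 1; they sum to 1.
    """
    x = min(max(x, 0.0), 1.0)
    low = max(0.0, 1.0 - 2.0 * x)
    high = max(0.0, 2.0 * x - 1.0)
    return (low, 1.0 - low - high, high)

# Hypothetical rule table: RULES[i][j] is the output value (a singleton)
# of the rule "IF previous likelihood is label i AND match is label j".
RULES = (
    (0.00, 0.25, 0.50),  # previous likelihood LOW
    (0.25, 0.50, 0.75),  # previous likelihood MEDIUM
    (0.50, 0.75, 1.00),  # previous likelihood HIGH
)

def step(prev, match):
    """One sum-product inference step: product activation, sum aggregation."""
    mu_prev = memberships(prev)
    mu_match = memberships(match)
    num = 0.0
    den = 0.0
    for i, a in enumerate(mu_prev):
        for j, b in enumerate(mu_match):
            weight = a * b              # product: rule activation degree
            num += weight * RULES[i][j] # sum: aggregate weighted outputs
            den += weight
    return num / den

def run(match, steps=50, start=0.5):
    """Feed the output back as the next input, with a constant match score."""
    likelihood = start
    for _ in range(steps):
        likelihood = step(likelihood, match)
    return likelihood

if __name__ == "__main__":
    # Two competing hypotheses receiving different constant match scores.
    print(run(0.8), run(0.3))
```

With this particular rule table the update reduces to averaging the previous likelihood with the match score, so under a constant input the fed-back likelihood settles at that input, and a hypothesis with consistently better matches ends up with the higher likelihood. A different rule table would give a different steady state, which is the kind of dependence on the rule base that the paper sets out to characterize.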
In this paper, we give the fuzzy MHT algorithm originally
developed in [1] a solid theoretical foundation by analytically
characterizing the FIE on which the algorithm is based. Because
this mathematical characterization is application-independent,
a natural byproduct of this paper is a broadening of
the range of possible applications of the text-based MHT philosophy.
That is, not only is it possible to track mobile man-made
objects, but we will see it is possible to handle information about
any time-varying phenomenon, as long as the phenomenon can
be described by means of a few keywords, and the phenomenon
itself is statistically causal in the sense that the distribution of
future states is statistically dependent on past observed states.
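One way to write this causality condition is sketched below. The notation is an assumption introduced here for illustration and is not taken from the paper: $x_k$ denotes the state of the phenomenon at time step $k$, and $p(\cdot)$ a probability distribution over states.

```latex
\[
  p\left(x_{k+1} \mid x_1, \ldots, x_k\right) \neq p\left(x_{k+1}\right)
  \quad \text{for at least some observed histories } x_1, \ldots, x_k .
\]
```

In words, past observations carry information about the next state, which is what makes correlating reports over time meaningful.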
The principal ingredient of the FIE arising in the MHT
algorithm is a variant of well-known fuzzy feedback systems