Understanding Document Analysis and Understanding (through Modeling)
Bertin Klein, Stefan Agne
Deutsches Forschungszentrum für
Künstliche Intelligenz (DFKI)
Erwin-Schrödinger-Straße 57
67608 Kaiserslautern
{klein,agne}@dfki.de
Andrew D. Bagdanov
Faculty of Science
University of Amsterdam
Kruislaan 403
1098 SJ Amsterdam
andrew@science.uva.nl
Abstract
We adopt the viewpoint of the users of a DAU system. From
the users' point of view we sketch a picture of “Document
Analysis and Understanding” (DAU): merely a simple division
of DAU into six sub-tasks, which we consider a model of DAU.
We provide one elaborate example of a use of the model: the
module design of the commercially successful DAU system
smartFIX. We argue that such modeling of DAU can
benefit the whole field of DAU.
1. Motivation
What is our account of DAU problems, and how, accordingly,
do we approach them?
In order to construct a system for “Document Analysis
and Understanding” (DAU) [5], we had to become
conscious of how the system's users view the problem.
Internally we said: “The DAU community builds first-class
engines; however, we need a car now! People drive cars, not
engines.” In the present article we sketch our approach: a
little model of DAU.
Great thinkers have substantiated the purpose of a model.
Peirce [2, 4] pointed out that if there exists a notion, there
is always a reason for it. In other words, things are only
labeled with a new and different notion when they need to be
handled differently. If one has to deal with something, e.g.,
DAU, it is thus important to become aware of all the
“notions” implied. Here we sensed a gap in DAU.¹ Referring
to Nagy: there are some more questions to find and ask [7].
¹ Not only in DAU: “Many IT developers have a tendency to be
technology-driven: they have a nice tool, such as knowledge technology
(but many other IT examples exist as well), and want to push this
technology into the business. In contrast, we want to emphasize the need to be
problem-driven: first identify current knowledge-related problems (apart
of the knowledge management process) and from there recommend
solutions [...].” [1]
Figure 1. Researchers in DAU need to understand
the world of the owners of DAU problems.
Modeling improved our “understanding” of DAU, the
precondition for our acting, and thus had very practical
consequences. To us it appears true that “Usually,
knowledge providers, users, and decision makers are very
different persons with very different interests.” [1] However,
modeling helped to negotiate between the different views. We thus
aim to recommend modeling (i.e., construction and criticism
of models, of which our model could be one little part) in order
to advance three central and challenging issues in DAU:
• better visibility of DAU,
• further research successes in DAU technology,
• visions for the future of DAU.
In the design of the DAU system smartFIX, our model
helped us address each of these issues. We gained
clues about how to present DAU better to its users. We
understood and solved technological challenges. We identified
as yet unsolved scenarios that seem solvable to us in the
future.
Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR’03)
0-7695-1960-1/03 $17.00 © 2003 IEEE