Understanding Document Analysis and Understanding (through Modeling)

Bertin Klein, Stefan Agne
Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI)
Erwin-Schrödinger-Straße 57
67608 Kaiserslautern
{klein,agne}@dfki.de

Andrew D. Bagdanov
Faculty of Science
University of Amsterdam
Kruislaan 403
1098 SJ Amsterdam
andrew@science.uva.nl

Abstract

We adopt the viewpoint of the users of a DAU system. From that viewpoint we sketch a picture of “Document Analysis and Understanding” (DAU), merely a simple division of DAU into six sub-tasks, and consider this a model of DAU. We provide one elaborate example of the model's use: the module design of the commercially successful DAU system smartFIX. We argue that such modeling of DAU can benefit the whole field of DAU.

1. Motivation

What is our account of DAU problems, and how, therefore, do we approach them?

In order to construct a system for “Document Analysis and Understanding” (DAU) [5] we had to become conscious of how the system's users view the problem. Internally we said: “The DAU community builds first-class engines; however, we need a car now! People drive cars, not engines.” In the present article we sketch our approach: a little model of DAU.

Great thinkers have substantiated the purpose of a model. Peirce [2, 4] pointed out that if there exists a notion, there is always a reason for it. In other words, things are only labeled with a new and different notion when they need to be handled differently. If one has to deal with something, e.g. DAU, it is thus important to become aware of all the “notions” implied. Here we sensed a gap in DAU.¹ Referring to Nagy: there are some more questions to find and ask [7].

¹ Not only in DAU: “Many IT developers have a tendency to be technology-driven: they have a nice tool, such as knowledge technology (but many other IT examples exist as well), and want to push this technology into the business.
In contrast, we want to emphasize the need to be problem-driven: first identify current knowledge-related problems (apart of the knowledge management process) and from there recommend solutions [...].” [1]

Figure 1. Researchers in DAU need to understand the world of the owners of DAU problems.

Modeling improved our “understanding” of DAU, the precondition for our acting, and thus it had very practical consequences. It appears true to us that: “Usually, knowledge providers, users, and decision makers are very different persons with very different interests.” [1]. However, modeling helped to negotiate between the different views.

We thus aim to recommend modeling (i.e. the construction and criticism of models, of which our model could be one little part) in order to advance three central and challenging issues in DAU: better visibility of DAU, further research successes in DAU technology, and visions for the future of DAU.

In the design of the DAU system smartFIX, our model helped us with each of these issues. We gained clues about how to present DAU better to its users. We understood and solved technological challenges. We identified scenarios that are as yet unsolved but seem solvable to us in the future.

Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR’03) 0-7695-1960-1/03 $17.00 © 2003 IEEE