A fast near-optimal min-# polygonal approximation of digitized curves

Alexander Kolesnikov (1) and Pasi Fränti (2)

1) Institute of Automation and Electrometry, Pr. Ak. Koptyuga, 1, Novosibirsk, 630090, Russia. Email: kolesnikov@iae.nsk.su
2) Dept. of Computer Science, University of Joensuu, BOX 111, FIN 80101, Joensuu, Finland. Email: franti@cs.joensuu.fi

ABSTRACT

We propose a fast near-optimal algorithm for solving the min-# problem of polygonal approximation of digitized curves. The algorithm consists of two steps. It first finds a reference approximation with the minimum number of segments for a given error tolerance, using the $L_\infty$ error metric. It then improves the quality of the approximation by reduced-search dynamic programming with the additive $L_2$ error measure. The algorithm is tailored for high-quality vectorization of digitized curves.

Keywords: vectorization, polygonal approximation, min-# problem, shortest path, dynamic programming

1. INTRODUCTION

We consider the problem of polygonal approximation of open digitized curves for high-quality vectorization tasks. The task is to find, for a curve of $N$ vertices, the polygonal approximation with the minimum number of linear segments $M$ that satisfies a given error tolerance. This is known as the min-# problem.

The problem is closely related to the min-ε problem, which aims at minimizing the approximation error for a given number of segments $M$. The min-ε problem can be solved by graph theory methods in $O(N^2 \log N)$ time, as proposed in [1]. It can also be solved by a dynamic programming algorithm in $O(N^2 M)$ time, as proposed in [2]. Salotti has improved this approach with a method that works in $O(N^2)$ time [3]. Schuster and Katsaggelos [4] have proposed another optimal algorithm, based on the Lagrange multiplier method, with a time complexity of $O(N^2)$. All of the above algorithms are optimal, but they have quadratic or cubic time complexity, which makes them impractical for a large number of vertices $N$.
In a recent paper [5], we introduced a fast near-optimal algorithm for the min-ε problem whose time complexity is remarkably lower than $O(N^2)$.

Several algorithms for the min-# problem also exist. A graph theory method was proposed in [6], and the complexity of this algorithm was then reduced to $O(N^2)$ in [1, 4, 7]. The dynamic programming approach can also be used to solve the problem, but the complexity of the algorithm is $O(N^3)$ [2, 8].

A fast near-optimal algorithm for the min-# problem was proposed in [9]. The algorithm provides a solution with the minimum number of segments $M$ for a given error tolerance $d_T$: $d \le d_T$. The approximation error $d$ is defined as the maximum Euclidean distance from the vertices to the approximating segments; this is the so-called $L_\infty$ error metric. The algorithm has been tailored especially to polygons with a low number of segments, which are suitable as shape signatures in image retrieval from multimedia databases.

Technically, the algorithm in [9] can also be used for polygonal approximation in vectorization tasks. However, the $L_\infty$ error metric is inferior to $L_2$ when the approximation must meet a low error tolerance, as in high-quality vectorization. The $L_2$ error metric (corresponding to the mean squared error $E$) was also considered in [9], but only as a local distortion measure and not as a global cost function for the whole curve. The additive $L_2$ error metric is thus expected to provide better results for the same number of approximating segments $M$ in the vectorization application. On the other hand, the error measure $E$ can hardly be used as the error tolerance in the min-# problem because of its additive nature. To resolve this dilemma, we propose to use the $L_\infty$ metric for the input control parameter $d_T$, and the additive error measure $E$ with the $L_2$ metric as the cost function in the optimization.
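To make the two error measures concrete, the following minimal sketch computes both for a given polygonal approximation. It rests on several assumptions that are not stated in the text: the helper names `seg_errors` and `polygon_errors` are hypothetical, the distance is measured to the closed segment (not to the infinite line through its end points), and $E$ is taken as the un-normalized sum of squared distances, so that it is additive over segments.

```python
import math

def seg_errors(pts, i, j):
    """Errors of approximating vertices i..j by the segment pts[i]-pts[j].

    Returns (d_max, e2): the maximum distance (L-infinity style) and the
    sum of squared distances (L2 style) of the intermediate vertices.
    """
    (x1, y1), (x2, y2) = pts[i], pts[j]
    dx, dy = x2 - x1, y2 - y1
    len2 = dx * dx + dy * dy
    d_max, e2 = 0.0, 0.0
    for k in range(i + 1, j):
        px, py = pts[k][0] - x1, pts[k][1] - y1
        # Assumption: clamp the projection to [0, 1] so that the distance
        # is measured to the segment itself.
        t = 0.0 if len2 == 0 else max(0.0, min(1.0, (px * dx + py * dy) / len2))
        dist2 = (px - t * dx) ** 2 + (py - t * dy) ** 2
        d_max = max(d_max, math.sqrt(dist2))
        e2 += dist2
    return d_max, e2

def polygon_errors(pts, idx):
    """Errors of the polygon whose vertices are pts[k] for k in idx.

    d is the maximum over all segments; E is the sum over all segments.
    """
    d, E = 0.0, 0.0
    for a, b in zip(idx, idx[1:]):
        d_max, e2 = seg_errors(pts, a, b)
        d = max(d, d_max)
        E += e2
    return d, E

# Hypothetical four-vertex curve approximated by one and by two segments.
curve = [(0, 0), (1, 1), (2, 1), (3, 0)]
print(polygon_errors(curve, [0, 3]))
print(polygon_errors(curve, [0, 1, 3]))
```

The maximum in $d$ depends only on the worst vertex, whereas $E$ accumulates the contribution of every vertex, which is why $d$ is convenient as a tolerance and $E$ as a cost function.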
In polygonal approximation of lines in engineering drawings, maps, schemes, etc., the distortion tolerance $d_T$ can be set to at most half of the line width [10], or to 1-2 pixels for polygonal approximation of region borders in segmentation tasks.

In this work, we generalize the near-optimal algorithm for the min-ε problem [5] so that it also solves the min-# problem. We formulate the min-# problem in two forms: strong and weak. The strong form means that the optimal solution under the $L_2$ error metric has to satisfy the constraint on the maximum distortion: $d \le d_T$. The weak form means that we look for an optimal solution without enforcing this strict constraint on the distortion; it merely aims at finding a solution for which $d \approx d_T$.
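The difference between the strong and the weak form can be illustrated with a brute-force sketch. This is a full-search illustration only, not the reduced-search algorithm proposed in this paper: it first finds the minimum number of segments $M$ under the $L_\infty$ tolerance by a shortest-path pass, and then runs an exhaustive dynamic programming search for the $M$-segment polygon with minimum $E$. All function names and the `strong` flag are hypothetical, and the distance-to-segment and sum-of-squares conventions are assumptions.

```python
import math

def seg_errors(pts, i, j):
    """Return (max distance, sum of squared distances) of the vertices
    strictly between i and j to the segment pts[i]-pts[j]."""
    (x1, y1), (x2, y2) = pts[i], pts[j]
    dx, dy = x2 - x1, y2 - y1
    len2 = dx * dx + dy * dy
    d_max, e2 = 0.0, 0.0
    for k in range(i + 1, j):
        px, py = pts[k][0] - x1, pts[k][1] - y1
        t = 0.0 if len2 == 0 else max(0.0, min(1.0, (px * dx + py * dy) / len2))
        dist2 = (px - t * dx) ** 2 + (py - t * dy) ** 2
        d_max = max(d_max, math.sqrt(dist2))
        e2 += dist2
    return d_max, e2

def min_segments(pts, d_tol):
    """Step 1: minimum number of segments M such that d <= d_tol.

    Shortest path in the graph whose edges are the feasible segments.
    """
    n = len(pts)
    steps = [math.inf] * n
    steps[0] = 0
    for j in range(1, n):
        for i in range(j):
            if steps[i] + 1 < steps[j] and seg_errors(pts, i, j)[0] <= d_tol:
                steps[j] = steps[i] + 1
    return steps[-1]

def min_number_l2(pts, d_tol, strong=True):
    """Step 2 (full search): M-segment polygon with minimum E.

    strong=True keeps only segments with d <= d_tol (strong form);
    strong=False ignores the constraint during optimization (weak form).
    """
    n = len(pts)
    M = min_segments(pts, d_tol)
    cost = [[math.inf] * n for _ in range(M + 1)]
    back = [[-1] * n for _ in range(M + 1)]
    cost[0][0] = 0.0
    for m in range(1, M + 1):
        for j in range(1, n):
            for i in range(j):
                if cost[m - 1][i] == math.inf:
                    continue
                d_max, e2 = seg_errors(pts, i, j)
                if strong and d_max > d_tol:
                    continue
                if cost[m - 1][i] + e2 < cost[m][j]:
                    cost[m][j] = cost[m - 1][i] + e2
                    back[m][j] = i
    # Backtrack from the last vertex to recover the polygon vertices.
    idx, j = [n - 1], n - 1
    for m in range(M, 0, -1):
        j = back[m][j]
        idx.append(j)
    return idx[::-1], cost[M][n - 1]

curve = [(0, 0), (1, 1), (2, 1), (3, 0)]
print(min_number_l2(curve, 0.5, strong=True))
print(min_number_l2(curve, 0.5, strong=False))
```

Because the weak form searches a superset of the segments admitted by the strong form, its minimum $E$ can never be larger, although its maximum distortion $d$ may exceed $d_T$. The sketch takes $O(N^3)$ time or more, which is exactly the cost that a reduced search is meant to avoid.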