Computational Complexity of NFA Minimization for Finite and Unary Languages

Hermann Gruber¹ and Markus Holzer²

¹ Institut für Informatik, Ludwig-Maximilians-Universität München, Oettingenstraße 67, D-80538 München, Germany
email: gruberh@tcs.ifi.lmu.de
² Institut für Informatik, Technische Universität München, Boltzmannstraße 3, D-85748 Garching bei München, Germany
email: holzer@in.tum.de

Abstract. We investigate the computational complexity of the nondeterministic finite automaton (NFA) minimization problem for finite and unary regular languages, if the input is specified by a deterministic finite state machine. While the general case of this problem is PSPACE-complete [13], it becomes theoretically easier when restricted to the aforementioned language families. It is easy to see that in both cases, an upper bound is $\Sigma_2^P$, the second level of the Polynomial Hierarchy. Concerning a respective lower bound, we show that the minimization problem for NFAs accepting finite languages is hard for the complexity class DP, which includes both NP and coNP, and is a subset of $\Sigma_2^P$. Moreover, we show that the corresponding problem for unary regular languages in general, i.e., not limited to the cyclic case, can be approximated in polynomial time within a performance ratio of $O(\sqrt{n})$, where $n$ is the number of states of the given deterministic finite state machine. This generalizes a result obtained recently for cyclic unary languages [6]. We also show that one cannot approximate the unary NFA minimization problem with $o(n)$, if the input is an NFA, which is an optimal bound, unless P = NP.

1 Introduction

Finite automata are one of the oldest and most intensely investigated computational models. It is well known that deterministic and nondeterministic finite automata are computationally equivalent, and that nondeterministic finite automata can offer exponential state savings compared to deterministic ones [19].
On the other hand, minimizing deterministic finite automata (DFAs) can be carried out efficiently, whereas the state minimization problem for nondeterministic finite state automata (NFAs) is PSPACE-complete, even if the regular language is specified as a DFA [13]. This theoretical problem is quite relevant for applications where finite automata are involved, such as computational biology or natural language processing [4, 20], because it measures the amount of space needed to store the devices under consideration in memory. Common to most applications is that they have to deal with huge masses of data. The situation is