Computational Complexity of NFA Minimization for Finite and Unary Languages

Hermann Gruber¹ and Markus Holzer²

¹ Institut für Informatik, Ludwig-Maximilians-Universität München, Oettingenstraße 67, D-80538 München, Germany
email: gruberh@tcs.ifi.lmu.de

² Institut für Informatik, Technische Universität München, Boltzmannstraße 3, D-85748 Garching bei München, Germany
email: holzer@in.tum.de

Abstract. We investigate the computational complexity of the nondeterministic finite automaton (NFA) minimization problem for finite and unary regular languages, if the input is specified by a deterministic finite state machine. While the general case of this problem is PSPACE-complete [13], it becomes theoretically easier when restricted to the aforementioned language families. It is easy to see that in both cases, an upper bound is $\Sigma_2^P$, the second level of the Polynomial Hierarchy. Concerning a respective lower bound, we show that the minimization problem for NFAs accepting finite languages is hard for the complexity class DP, which includes both NP and coNP, and is a subset of $\Sigma_2^P$. Moreover, we show that the corresponding problem for unary regular languages in general, i.e., not limited to the cyclic case, can be approximated in polynomial time within a performance ratio of $O(\sqrt{n})$, where $n$ is the number of states of the given deterministic finite state machine. This generalizes a result obtained recently for cyclic unary languages [6]. We also show that one cannot approximate the unary NFA minimization problem with $o(n)$, if the input is an NFA, which is an optimal bound, unless P = NP.

1 Introduction

Finite automata are one of the oldest and most intensely investigated computational models. It is well known that deterministic and nondeterministic finite automata are computationally equivalent, and that nondeterministic finite automata can offer exponential state savings compared to deterministic ones [19].
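The exponential gap between nondeterministic and deterministic automata can be observed on a standard textbook example, which is not taken from this paper: the language of all words over {a, b} whose k-th symbol from the end is a. The following minimal sketch builds an NFA with k + 1 states for this language and counts the subsets reachable in the powerset construction. The function names and the dictionary encoding of the transition relation are illustrative assumptions.

```python
from collections import deque


def kth_from_end_nfa(k):
    """Hypothetical helper: NFA with states 0..k accepting all words over
    {a, b} whose k-th symbol from the end is 'a'."""
    # State 0 loops on every symbol and guesses, on 'a', that this symbol
    # is the k-th one from the end.
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    # States 1..k-1 simply count the remaining symbols.
    for i in range(1, k):
        delta[(i, "a")] = {i + 1}
        delta[(i, "b")] = {i + 1}
    # State k is accepting and has no outgoing transitions.
    return delta, 0, {k}


def reachable_subsets(delta, start, alphabet="ab"):
    """Powerset construction restricted to subsets reachable from {start}."""
    start_set = frozenset([start])
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            nxt = frozenset(
                q for p in current for q in delta.get((p, symbol), ())
            )
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def dfa_state_count(k):
    """Number of DFA states produced by the powerset construction."""
    delta, start, _accepting = kth_from_end_nfa(k)
    return len(reachable_subsets(delta, start))


if __name__ == "__main__":
    for k in range(1, 6):
        print(k + 1, "NFA states ->", dfa_state_count(k), "DFA states")
```

For this family the powerset construction reaches exactly 2^k subsets, and the resulting DFA is known to be minimal, so the k + 1 states of the NFA are exponentially fewer than the states of any equivalent DFA.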
On the other hand, minimizing deterministic finite automata (DFAs) can be carried out efficiently, whereas the state minimization problem for nondeterministic finite state automata (NFAs) is PSPACE-complete, even if the regular language is specified as a DFA [13]. This theoretical problem is quite relevant for applications where finite automata are involved, such as computational biology or natural language processing [4, 20], because it measures the amount of space needed to store the devices under consideration in memory. Common to most applications is that they have to deal with huge masses of data. The situation is