The Wavelet Matrix

Francisco Claude 1,⋆ and Gonzalo Navarro 2,⋆⋆

1 David R. Cheriton School of Computer Science, University of Waterloo.
2 Department of Computer Science, University of Chile.

Abstract. The wavelet tree (Grossi et al., SODA 2003) is nowadays a popular succinct data structure for text indexes, discrete grids, and many other applications. When it has many nodes, a levelwise representation proposed by Mäkinen and Navarro (LATIN 2006) is preferable. We propose a different arrangement of the levelwise data, so that the bitmaps are shuffled in a different way. The result can no longer be called a wavelet tree, and we dub it the wavelet matrix. We demonstrate that the wavelet matrix is simpler to build, simpler to query, and faster in practice than the levelwise wavelet tree. This has a direct impact on many applications that use the levelwise wavelet tree for different purposes.

1 Introduction

The wavelet tree [20] is a data structure designed to represent a sequence S[1,n] over alphabet [0,σ) and answer some queries on it. The following queries are sufficient to provide efficient data structures for many applications:

– access(S, i) returns S[i].
– rank_a(S, i) returns the number of occurrences of symbol a in S[1,i].
– select_a(S, j) returns the position in S of the j-th occurrence of symbol a.

A wavelet tree is a balanced binary tree with σ leaves and σ − 1 internal nodes, each of which holds a bitmap. In its most basic form, the bitmaps add up to n⌈lg σ⌉ bits. Those bitmaps are equipped with sublinear-size structures to carry out binary rank and select operations. Considering carefully implemented pointers of lg n bits for the tree, the basic wavelet tree requires n lg σ + o(n lg σ) + O(σ lg n) bits. This is asymptotically equivalent to a plain representation of S, yet the wavelet tree is able to solve the three operations in time O(lg σ).
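
As an illustration of the three queries on a pointer-based wavelet tree, the following is a minimal Python sketch, not the authors' implementation. It makes several assumptions: the class name WaveletTree and its helper names are hypothetical, each node splits its alphabet range [lo, hi) at the midpoint, plain prefix-sum lists stand in for the sublinear-size binary rank and select structures (so the space bounds above do not hold here), and select assumes that the j-th occurrence exists. Positions are 1-based, as in the text.

```python
class WaveletTree:
    """Pointer-based wavelet tree over the alphabet range [lo, hi)."""

    def __init__(self, seq, lo, hi):
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        if hi - lo <= 1:
            return  # leaf: every symbol routed here equals lo
        self.mid = (lo + hi) // 2
        # Bit is 0 for symbols below mid and 1 otherwise.
        # ones[i] = number of 1-bits among the first i bits of this node.
        # (Assumption: prefix sums replace a succinct rank structure.)
        self.ones = [0]
        for c in seq:
            self.ones.append(self.ones[-1] + (c >= self.mid))
        self.left = WaveletTree([c for c in seq if c < self.mid], lo, self.mid)
        self.right = WaveletTree([c for c in seq if c >= self.mid], self.mid, hi)

    def _rank_bit(self, b, i):
        """Number of bits equal to b among the first i bits."""
        return self.ones[i] if b else i - self.ones[i]

    def _select_bit(self, b, j):
        """1-based position of the j-th bit equal to b (assumed to exist)."""
        lo, hi = 1, len(self.ones) - 1
        while lo < hi:
            m = (lo + hi) // 2
            if self._rank_bit(b, m) >= j:
                hi = m
            else:
                lo = m + 1
        return lo

    def access(self, i):
        """Return S[i] by following the bit at position i down to a leaf."""
        if self.left is None:
            return self.lo
        if self.ones[i] - self.ones[i - 1]:  # bit at position i is 1
            return self.right.access(self.ones[i])
        return self.left.access(i - self.ones[i])

    def rank(self, a, i):
        """Number of occurrences of a in S[1, i]."""
        if self.left is None:
            return i
        if a < self.mid:
            return self.left.rank(a, i - self.ones[i])
        return self.right.rank(a, self.ones[i])

    def select(self, a, j):
        """Position in S of the j-th occurrence of a (assumed to exist)."""
        if self.left is None:
            return j
        if a < self.mid:
            return self._select_bit(0, self.left.select(a, j))
        return self._select_bit(1, self.right.select(a, j))


S = [3, 1, 4, 1, 5, 2, 6, 5]   # example sequence over [0, 8)
wt = WaveletTree(S, 0, 8)
print(wt.access(5), wt.rank(1, 4), wt.select(5, 2))
```

Each query visits one node per level, which is where the O(lg σ) time comes from once the per-node bit operations take constant time. In this sketch _select_bit uses a binary search instead, so select is slower than that bound.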

⋆ Funded by Google U.S./Canada PhD Fellowship.
⋆⋆ Funded in part by Millennium Nucleus Information and Coordination in Networks ICM/FIC P10-024F, Chile.

However, in applications where the alphabet is large, the O(σ lg n) term may become dominant (both in theory and in practice). Mäkinen and Navarro [24,26] showed that it is possible to concatenate all the bitmaps of each level and still simulate the tree navigation using rank and select operations on the concatenated bitmaps. The size was reduced to n lg σ + o(n lg σ) bits. While in theory the complexities stayed the same, in practice one needs three times the