Faster compressed dictionary matching
Document Type
Conference Proceeding
Publication Date
11-24-2010
Abstract
Given a set D of d patterns, the dictionary matching problem is to index D such that for any query text T, we can locate the occurrences of any pattern within T efficiently. When D contains a total of n characters drawn from an alphabet of size σ, Hon et al. (2008) gave an nH_k(D) + o(n log σ)-bit index which supports a query in O(|T| (log^ε n + log d) + occ) time, where ε > 0 and H_k(D) denotes the kth order entropy of D. Very recently, Belazzougui (2010) proposed an elegant scheme, which takes n log σ + O(n) bits of index space and supports a query in optimal O(|T| + occ) time. In this paper, we provide connections between Belazzougui's index and the XBW compression of Ferragina et al. (2005), and show that Belazzougui's index can be slightly modified to be stored in nH_k(D) + O(n) bits, while query time remains optimal; this improves the compressed index by Hon et al. (2008) in both space and time. © 2010 Springer-Verlag.
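To make the dictionary matching problem concrete, the following is a minimal sketch of the classical, uncompressed Aho-Corasick automaton, which reports all occurrences of the patterns in a text in time linear in the text length plus the number of occurrences. It is not the compressed index described in the abstract: it stores the trie with explicit pointers and makes no attempt at nH_k(D) + O(n)-bit space. The function names build_automaton and find_occurrences are illustrative assumptions, not taken from the paper.

```python
from collections import deque


def build_automaton(patterns):
    """Build an Aho-Corasick automaton (illustrative, uncompressed)."""
    goto = [{}]   # goto[s] maps a character to the next state; state 0 is the root
    fail = [0]    # fail[s] is the state for the longest proper suffix of s in the trie
    out = [[]]    # out[s] lists indices of patterns that end at state s
    # Phase 1: insert every pattern into the trie.
    for idx, pattern in enumerate(patterns):
        s = 0
        for ch in pattern:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(idx)
    # Phase 2: compute failure links in breadth-first order.
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            # Inherit patterns that end at the failure state (proper suffixes).
            out[t] = out[t] + out[fail[t]]
            queue.append(t)
    return goto, fail, out


def find_occurrences(patterns, text):
    """Return (start_position, pattern) for every occurrence in text."""
    goto, fail, out = build_automaton(patterns)
    s = 0
    hits = []
    for i, ch in enumerate(text):
        # Follow failure links until a transition on ch exists or we reach the root.
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for idx in out[s]:
            hits.append((i - len(patterns[idx]) + 1, patterns[idx]))
    return hits
```

For example, find_occurrences(["he", "she", "his", "hers"], "ushers") reports "she" starting at position 1 and both "he" and "hers" starting at position 2 (0-indexed). The indexes discussed in the abstract answer the same kind of query while representing the trie and its failure links in compressed space.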
Publication Source (Journal or Book title)
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
First Page
191
Last Page
200
Recommended Citation
Hon, W., Ku, T., Shah, R., Thankachan, S., & Vitter, J. (2010). Faster compressed dictionary matching. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 6393 LNCS, 191-200. https://doi.org/10.1007/978-3-642-16321-0_19