Document Type
Article
Publication Date
3-4-2013
Abstract
Given a set D of d patterns, the dictionary matching problem is to index D such that for any query text T, we can locate the occurrences of any pattern within T efficiently. When D contains a total of n characters drawn from an alphabet of size σ, Hon et al. (2008) [12] gave an nH_k(D) + o(n log σ)-bit index which supports a query in O(|T|(log^ε n + log d) + occ) time, where ε > 0 and H_k(D) denotes the kth-order entropy of D. Very recently, Belazzougui (2010) [3] has proposed an elegant scheme, which takes n log σ + O(n) bits of index space and supports a query in optimal O(|T| + occ) time. In this paper, we provide connections between Belazzougui's index and the XBW compression of Ferragina and Manzini (2005) [8], and show that Belazzougui's index can be slightly modified to be stored in nH_k(D) + O(n) bits, while query time remains optimal; this improves the compressed index by Hon et al. (2008) [12] in both space and time. © 2013 Elsevier B.V. All rights reserved.
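To make the problem statement concrete, the following is a minimal sketch of dictionary matching using the classical, uncompressed Aho-Corasick automaton. It is not the compressed index described in the abstract: it stores explicit pointers and uses far more than nH_k(D) + O(n) bits. It only illustrates what a query returns, namely every occurrence of every pattern of D in T, found in a single left-to-right scan of T. The function names build_automaton and find_occurrences are illustrative and do not come from the paper, and patterns are assumed to be non-empty strings.

```python
from collections import deque


def build_automaton(patterns):
    """Build trie transitions, failure links and output lists for the patterns."""
    goto = [{}]   # goto[s][ch] -> child state of s on character ch
    fail = [0]    # fail[s] -> state for the longest proper suffix of s in the trie
    out = [[]]    # out[s] -> indices of all patterns that end at state s

    # Phase 1: insert every pattern into the trie.
    for idx, pat in enumerate(patterns):
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(idx)

    # Phase 2: breadth-first traversal to set failure links.
    # Depth-1 states keep the root as their failure state.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit the patterns that end at the failure state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_occurrences(patterns, text):
    """Return (start_position, pattern) for every occurrence in text."""
    goto, fail, out = build_automaton(patterns)
    state = 0
    hits = []
    for i, ch in enumerate(text):
        # Follow failure links until a transition on ch exists or we reach the root.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for idx in out[state]:
            hits.append((i - len(patterns[idx]) + 1, patterns[idx]))
    return hits


if __name__ == "__main__":
    print(find_occurrences(["he", "she", "his", "hers"], "ushers"))
```

In this sketch the scan visits each character of T once and reports each occurrence once, which is the behaviour that the O(|T| + occ) query bound in the abstract describes for the succinct setting.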
Publication Source (Journal or Book title)
Theoretical Computer Science
First Page
113
Last Page
119
Recommended Citation
Hon, W., Ku, T., Shah, R., Thankachan, S., & Vitter, J. (2013). Faster compressed dictionary matching. Theoretical Computer Science, 475, 113-119. https://doi.org/10.1016/j.tcs.2012.10.050