Document Type
Conference Proceeding
Publication Date
9-1-2008
Abstract
The past few years have witnessed several exciting results on compressed representations of a string T that support efficient pattern matching, and the space complexity has been reduced to |T|Hk(T) + o(|T| log σ) bits [8, 10], where Hk(T) denotes the kth-order empirical entropy of T, and σ is the size of the alphabet. In this paper we study compressed representations for another classical problem of string indexing, which is called dictionary matching in the literature. Precisely, a collection D of strings (called patterns) of total length n is to be indexed so that, given a text T, the occurrences of the patterns in T can be found efficiently. We show how to exploit a sampling technique to compress the existing O(n)-word index to an (nHk(D) + o(n log σ))-bit index with only a small sacrifice in search time. © 2008 IEEE.
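To make the problem statement in the abstract concrete, the following is a minimal sketch of dictionary matching as a naive baseline: it scans the text once per pattern and reports every occurrence. It is not the compressed index described in the paper and uses no sampling; the function name `find_occurrences` and the example patterns are assumptions chosen only for illustration.

```python
def find_occurrences(patterns, text):
    """Return sorted (position, pattern) pairs for every occurrence
    of any pattern in text. Naive baseline, not the paper's index."""
    hits = []
    for p in patterns:
        if not p:
            continue  # skip empty patterns, which would match everywhere
        start = text.find(p)
        while start != -1:
            hits.append((start, p))
            # advance by one so overlapping occurrences are also reported
            start = text.find(p, start + 1)
    return sorted(hits)


# Hypothetical dictionary D and text T
D = ["he", "she", "his", "hers"]
T = "ushers"
print(find_occurrences(D, T))  # [(1, 'she'), (2, 'he'), (2, 'hers')]
```

This baseline rescans the text for each pattern and stores the patterns uncompressed, which is the cost that an index over D is meant to avoid.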
Publication Source (Journal or Book title)
Data Compression Conference Proceedings
First Page
23
Last Page
32
Recommended Citation
Hon, W., Shah, R., Lam, T., Tam, S., & Vitter, J. (2008). Compressed index for dictionary matching. Data Compression Conference Proceedings, 23-32. https://doi.org/10.1109/DCC.2008.62