Faster compressed dictionary matching

Document Type

Conference Proceeding

Publication Date

11-24-2010

Abstract

Given a set D of d patterns, the dictionary matching problem is to index D such that for any query text T, we can locate the occurrences of any pattern within T efficiently. When D contains a total of n characters drawn from an alphabet of size σ, Hon et al. (2008) gave an nHk(D) + o(n log σ)-bit index which supports a query in O(|T| (logε n+ log d) + occ) time, where ε > 0 and Hk(D) denotes the kth order entropy of D. Very recently, Belazzougui (2010) proposed an elegant scheme, which takes n log σ +O(n) bits of index space and supports a query in optimal O(|T|+occ) time. In this paper, we provide connections between Belazzougui's index and the XBW compression of Ferragina et al. (2005), and show that Belazzougui's index can be slightly modified to be stored in nH k(D) + O(n) bits, while query time remains optimal; this improves the compressed index by Hon et al. (2008) in both space and time. © 2010 Springer-Verlag.

Publication Source (Journal or Book title)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

First Page

191

Last Page

200

This document is currently not available here.

Share

COinS