Succinct non-overlapping indexing
Document Type
Conference Proceeding
Publication Date
1-1-2015
Abstract
Given a text T of n characters, we consider the non-overlapping indexing problem, defined as follows: preprocess T into a data structure such that, for any query pattern P, we can report a maximal set of non-overlapping occurrences of P in T. The best known solution takes linear space: a suffix tree of T is augmented with O(n)-word data structures, and a query P is answered in optimal O(|P| + nocc) time, where nocc is the output size [Cohen and Porat, ISAAC 2009]. We present the following new result: let CSA (not necessarily a compressed suffix array) be an index of T that can compute (i) the suffix range of P in search(P) time, and (ii) a suffix array or an inverse suffix array value in tSA time; then, using CSA alone, we can answer a query P in O(search(P) + nocc · tSA) time. Additionally, we present an improved result for a generalized version of this problem called range non-overlapping indexing.
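To make the problem statement concrete, the following is a minimal Python sketch of a naive baseline, not the algorithm of the paper. It assumes an uncompressed suffix array built by plain sorting, finds the suffix range of P by binary search, and then greedily keeps the leftmost occurrence that does not overlap the previous one. The function names (build_suffix_array, suffix_range, non_overlapping_occurrences) are hypothetical, and the running time is far from the bounds stated in the abstract, since the sketch sorts all occurrences of P rather than touching only those it reports.

```python
from bisect import bisect_left, bisect_right


def build_suffix_array(text):
    # Naive construction (illustration only): sort suffix start
    # positions by the suffixes they begin.
    return sorted(range(len(text)), key=lambda i: text[i:])


def suffix_range(text, sa, pattern):
    # Truncating lexicographically sorted suffixes to len(pattern)
    # characters keeps them in non-decreasing order, so binary search
    # yields the half-open range [lo, hi) of suffixes prefixed by pattern.
    m = len(pattern)
    prefixes = [text[i:i + m] for i in sa]
    return bisect_left(prefixes, pattern), bisect_right(prefixes, pattern)


def non_overlapping_occurrences(text, pattern):
    # Greedy left-to-right selection: always taking the earliest
    # occurrence that starts at or after the end of the last chosen one
    # gives a largest possible set of pairwise non-overlapping occurrences.
    if not pattern:
        return []
    sa = build_suffix_array(text)
    lo, hi = suffix_range(text, sa, pattern)
    chosen = []
    next_free = 0  # first text position not covered by a chosen occurrence
    for pos in sorted(sa[lo:hi]):
        if pos >= next_free:
            chosen.append(pos)
            next_free = pos + len(pattern)
    return chosen


if __name__ == "__main__":
    print(non_overlapping_occurrences("abababa", "aba"))
```

For the text "abababa" and the pattern "aba", the occurrences start at positions 0, 2 and 4; the occurrence at 2 overlaps the one at 0, so the sketch reports positions 0 and 4.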
Publication Source (Journal or Book title)
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
First Page
185
Last Page
195
Recommended Citation
Ganguly, A., Shah, R., & Thankachan, S. (2015). Succinct non-overlapping indexing. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9133, 185-195. https://doi.org/10.1007/978-3-319-19929-0_16