LF successor: Compact space indexing for order-isomorphic pattern matching
Document Type
Conference Proceeding
Publication Date
7-1-2021
Abstract
Two strings are order isomorphic iff the relative ordering of their characters is the same at all positions. For a given text T[1, n] over an ordered alphabet of size σ, we can maintain an order-isomorphic suffix tree/array in O(n log n) bits and support (order-isomorphic) pattern/substring matching queries efficiently. It is interesting to know whether we can encode these structures in space close to the text's size of n log σ bits. We answer this question positively by presenting an O(n log σ)-bit index that allows access to any entry in the order-isomorphic suffix array (and its inverse array) in t_SA = O(log² n / log σ) time. For any pattern P given as a query, this index can count the number of substrings of T that are order-isomorphic to P (denoted by occ) in O((|P| log σ + t_SA) log n) time using standard techniques. It can also report the locations of those substrings in additional O(occ · t_SA) time.
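To make the definition of order isomorphism concrete, the following is a minimal Python sketch of a naive quadratic-time baseline. It is not the index described in the abstract. The function names (rank_signature, order_isomorphic, naive_occurrences) are hypothetical and chosen for illustration only, and positions are 0-based here, whereas the abstract indexes T from 1.

```python
def rank_signature(s):
    """Replace each character by its rank among the distinct characters of s."""
    ranks = {c: r for r, c in enumerate(sorted(set(s)))}
    return [ranks[c] for c in s]


def order_isomorphic(x, y):
    """True iff x and y have the same length and the same relative ordering
    of characters at all pairs of positions (ties included)."""
    return len(x) == len(y) and rank_signature(x) == rank_signature(y)


def naive_occurrences(text, pattern):
    """Start positions (0-based) of substrings of text order-isomorphic to pattern.
    Illustrative baseline only: it re-ranks every window from scratch."""
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if order_isomorphic(text[i:i + m], pattern)
    ]


if __name__ == "__main__":
    print(order_isomorphic([10, 30, 20], [1, 5, 2]))
    print(naive_occurrences([3, 1, 4, 1, 5, 9, 2, 6], [1, 2, 3]))
```

In this sketch, occ corresponds to len(naive_occurrences(T, P)). A compact index such as the one in the abstract answers the same counting and reporting queries without scanning every window of T.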
Publication Source (Journal or Book title)
Leibniz International Proceedings in Informatics, LIPIcs
Recommended Citation
Ganguly, A., Patel, D., Shah, R., & Thankachan, S. (2021). LF successor: Compact space indexing for order-isomorphic pattern matching. Leibniz International Proceedings in Informatics, LIPIcs, 198. https://doi.org/10.4230/LIPIcs.ICALP.2021.71