Two-Dimensional Longest Common Extension Queries in Compact Space
Document Type
Conference Proceeding
Publication Date
2-24-2025
Abstract
For a length $n$ text over an alphabet of size $\sigma$, we can encode the suffix tree data structure in $O(n \log \sigma)$ bits of space. It supports suffix array (SA), inverse suffix array (ISA), and longest common extension (LCE) queries in $O(\log^{\epsilon}_{\sigma} n)$ time, which enables efficient pattern matching; here $\epsilon > 0$ is an arbitrarily small constant. Further improvements are possible for LCE queries, where $O(1)$ time queries can be achieved using an index of space $O(n \log \sigma)$ bits. However, compactly indexing a two-dimensional text (i.e., an $n \times n$ matrix) has been a major open problem. We show progress in this direction by first presenting an $O(n^2 \log \sigma)$-bit structure supporting LCE queries in near $O((\log_{\sigma} n)^{2/3})$ time. We then present an $O(n^2 \log \sigma + n^2 \log \log n)$-bit structure supporting ISA queries in near $O(\log n \cdot (\log_{\sigma} n)^{2/3})$ time. Within a similar space, achieving SA queries in poly-logarithmic (even strongly sub-linear) time is a significant challenge. However, our $O(n^2 \log \sigma + n^2 \log \log n)$-bit structure can support SA queries in $O(n^2/(\sigma \log n)^c)$ time, where $c$ is an arbitrarily large constant, which enables pattern matching in time faster than what is possible without preprocessing. We then design a repetition-aware data structure. The $\delta_{2D}$ compressibility measure for two-dimensional texts was recently introduced by Carfagna and Manzini [SPIRE 2023]. The measure ranges from $1$ to $n^2$, with smaller $\delta_{2D}$ indicating a highly compressible two-dimensional text. The current data structure utilizing $\delta_{2D}$ allows only element access. We obtain the first structure based on $\delta_{2D}$ for LCE queries. It takes $\tilde{O}(n^{5/3} + n^{8/5} \delta_{2D}^{1/5})$ space and answers queries in $O(\log n)$ time.
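To make the query type concrete, the following is a brute-force sketch of what an LCE query returns, not the compact structures described in the abstract. The one-dimensional version is the standard definition (length of the longest common prefix of two suffixes). The two-dimensional version assumes the query asks for the side length of the largest pair of matching square submatrices anchored at two given top-left corners; the abstract does not spell this definition out, so treat it as an assumption. The function names `lce_1d` and `lce_2d` are hypothetical.

```python
def lce_1d(text, i, j):
    """Length of the longest common prefix of text[i:] and text[j:]."""
    k = 0
    while i + k < len(text) and j + k < len(text) and text[i + k] == text[j + k]:
        k += 1
    return k


def lce_2d(matrix, a, b):
    """Largest L such that the L x L squares with top-left corners a and b match.

    Assumption: this square-based definition is one common way to define
    two-dimensional LCE; it is used here for illustration only.
    `matrix` is an n x n grid given as a list of equal-length rows.
    """
    n = len(matrix)
    (r1, c1), (r2, c2) = a, b
    # Neither square may run past the bottom or right edge of the matrix.
    limit = n - max(r1, c1, r2, c2)
    size = 0
    while size < limit:
        k = size
        # Growing a square from k to k + 1 adds one new row and one new column.
        row_ok = all(matrix[r1 + k][c1 + t] == matrix[r2 + k][c2 + t]
                     for t in range(k + 1))
        col_ok = all(matrix[r1 + t][c1 + k] == matrix[r2 + t][c2 + k]
                     for t in range(k))
        if not (row_ok and col_ok):
            break
        size += 1
    return size


grid = ["abab",
        "baba",
        "abab",
        "baba"]
print(lce_1d("banana", 1, 3))
print(lce_2d(grid, (0, 0), (1, 1)))
```

This naive scan takes time proportional to the answer in one dimension and quadratic in the answer in two dimensions, which is the cost the indexed structures are designed to avoid.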
Publication Source (Journal or Book title)
Leibniz International Proceedings in Informatics, LIPIcs
Recommended Citation
Ganguly, A., Gibney, D., Shah, R., & Thankachan, S. (2025). Two-Dimensional Longest Common Extension Queries in Compact Space. Leibniz International Proceedings in Informatics, LIPIcs, 327. https://doi.org/10.4230/LIPIcs.STACS.2025.38