Categorical Range Maxima Queries
Document Type
Conference Proceeding
Publication Date
1-1-2014
Abstract
Given an array A[1..n] of n distinct elements from the set {1, 2, ..., n}, a range maximum query RMQ(a, b) returns the highest element in A[a..b] along with its position. In this paper, we study a generalization of this classical problem called the Categorical Range Maxima Query (CRMQ) problem, in which each element A[i] in the array has an associated category (color) given by C[i] ∈ [σ]. A query then asks to report each distinct color c appearing in C[a..b] along with the highest element (and its position) in A[a..b] with color c. Let p_c denote the position of the highest element in A[a..b] with color c. We investigate two variants of this problem: a threshold version and a top-κ version. In the threshold version, we only need to output the colors with A[p_c] greater than the input threshold τ, whereas the top-κ variant asks for the κ colors with the highest A[p_c] values. In the word RAM model, we achieve a linear-space structure with O(κ) query time that can report colors in sorted order of A[·]. In external memory, we present a data structure that answers queries in optimal O(1 + κ/B) I/Os using almost-linear O(n log* n) space, as well as a linear-space data structure with O(log* n + κ/B) query I/Os. Here κ represents the output size, log* n is the iterated logarithm of n, and B is the block size. CRMQ has applications to document retrieval and categorical range reporting, giving a one-shot framework to obtain improved results for both problems. Our results for CRMQ not only improve the best known results for three-sided categorical range reporting but also overcome the hurdle of maintaining color uniqueness in the output set. Copyright 2014 ACM.
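To make the query definitions concrete, here is a minimal sketch of the two variants as a naive linear scan over A[a..b]. It is only a reference baseline for the semantics stated in the abstract, not the data structure from the paper, and it takes time proportional to the range length rather than O(κ). The function names (crmq, crmq_threshold, crmq_top_k) and the sample arrays are hypothetical, chosen for illustration; positions are 1-based and ranges are inclusive, following the abstract's notation.

```python
def crmq(A, C, a, b):
    """Return {color: (value, position)} for the highest element of each
    color in A[a..b]. Positions are 1-based and the range is inclusive.
    Naive scan: O(b - a + 1) time, illustrative only."""
    best = {}
    for i in range(a, b + 1):
        value, color = A[i - 1], C[i - 1]
        if color not in best or value > best[color][0]:
            best[color] = (value, i)  # i plays the role of p_c
    return best


def _sorted_by_value(best):
    """List (color, value, position) triples in decreasing order of value."""
    triples = [(c, v, p) for c, (v, p) in best.items()]
    return sorted(triples, key=lambda t: t[1], reverse=True)


def crmq_threshold(A, C, a, b, tau):
    """Threshold variant: colors whose maximum A[p_c] is greater than tau."""
    return [t for t in _sorted_by_value(crmq(A, C, a, b)) if t[1] > tau]


def crmq_top_k(A, C, a, b, k):
    """Top-k variant: the k colors with the highest A[p_c] values."""
    return _sorted_by_value(crmq(A, C, a, b))[:k]


# Hypothetical sample input: n = 7 distinct values from {1, ..., 7}, 3 colors.
A = [3, 7, 1, 6, 2, 5, 4]
C = [1, 2, 1, 3, 2, 3, 1]

print(crmq(A, C, 2, 6))               # per-color maxima in A[2..6]
print(crmq_threshold(A, C, 2, 6, 5))  # colors with maximum above tau = 5
print(crmq_top_k(A, C, 2, 6, 2))      # two colors with the largest maxima
```

Each color appears at most once in the output because the scan keeps a single entry per color, which is the color-uniqueness property the abstract refers to.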
Publication Source (Journal or Book title)
Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems
First Page
266
Last Page
277
Recommended Citation
Patil, M., Thankachan, S., Shah, R., Nekrich, Y., & Vitter, J. (2014). Categorical Range Maxima Queries. Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 266-277. https://doi.org/10.1145/2594538.2594557