Categorical Range Maxima Queries

Document Type

Conference Proceeding

Publication Date

1-1-2014

Abstract

Given an array A[1..n] of n distinct elements from the set {1, 2, ..., n}, a range maximum query RMQ(a, b) returns the highest element in A[a..b] along with its position. In this paper, we study a generalization of this classical problem called the Categorical Range Maxima Query (CRMQ) problem, in which each element A[i] in the array has an associated category (color) given by C[i] ∈ [σ]. A query then asks to report each distinct color c appearing in C[a..b] along with the highest element (and its position) in A[a..b] with color c. Let p_c denote the position of the highest element in A[a..b] with color c. We investigate two variants of this problem: a threshold version and a top-κ version. In the threshold version, we only need to output the colors with A[p_c] greater than the input threshold τ, whereas the top-κ variant asks for the κ colors with the highest A[p_c] values. In the word RAM model, we achieve a linear-space structure with O(κ) query time that can report colors in sorted order of A[·]. In external memory, we present a data structure that answers queries in optimal O(1 + κ/B) I/Os using almost-linear O(n log* n) space, as well as a linear-space data structure with O(log* n + κ/B) query I/Os. Here κ represents the output size, log* n is the iterated logarithm of n, and B is the block size. CRMQ has applications to document retrieval and categorical range reporting, giving a one-shot framework to obtain improved results in both of these problems. Our results for CRMQ not only improve the best known results for three-sided categorical range reporting but also overcome the hurdle of maintaining color uniqueness in the output set. Copyright 2014 ACM.
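To make the query semantics in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the data structure from the paper: it scans the whole query range in O(b - a + 1) time and serves only as a reference definition of the threshold and top-κ variants. The function names (`crmq`, `crmq_threshold`, `crmq_top_k`) and the sample arrays are hypothetical, chosen for illustration; positions are 1-indexed and inclusive, as in the abstract.

```python
from typing import Dict, List, Tuple

# (color, value, position) triple reported for each color in the answer.
Answer = Tuple[int, int, int]


def crmq(A: List[int], C: List[int], a: int, b: int) -> Dict[int, Tuple[int, int]]:
    """Map each color c in C[a..b] to (A[p_c], p_c), by a linear scan.

    Positions are 1-indexed and the range is inclusive (assumed convention).
    """
    best: Dict[int, Tuple[int, int]] = {}
    for i in range(a, b + 1):
        value, color = A[i - 1], C[i - 1]
        if color not in best or value > best[color][0]:
            best[color] = (value, i)
    return best


def _sorted_answers(A: List[int], C: List[int], a: int, b: int) -> List[Answer]:
    """All per-color maxima in A[a..b], in decreasing order of value."""
    answers = [(c, v, p) for c, (v, p) in crmq(A, C, a, b).items()]
    answers.sort(key=lambda t: t[1], reverse=True)
    return answers


def crmq_threshold(A: List[int], C: List[int], a: int, b: int, tau: int) -> List[Answer]:
    """Threshold variant: colors whose range maximum exceeds tau."""
    return [t for t in _sorted_answers(A, C, a, b) if t[1] > tau]


def crmq_top_k(A: List[int], C: List[int], a: int, b: int, k: int) -> List[Answer]:
    """Top-k variant: the k colors with the highest range maxima."""
    return _sorted_answers(A, C, a, b)[:k]


# Hypothetical sample input: 7 distinct values, 3 colors.
A = [3, 7, 1, 5, 2, 6, 4]
C = [1, 2, 1, 3, 2, 3, 1]

print(crmq_threshold(A, C, 2, 6, tau=4))  # [(2, 7, 2), (3, 6, 6)]
print(crmq_top_k(A, C, 2, 6, k=1))        # [(2, 7, 2)]
```

Each color appears at most once in the output, which is the color-uniqueness property the abstract highlights. The structures described in the paper answer the same queries in time proportional to the output size rather than to the range length.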

Publication Source (Journal or Book title)

Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems

First Page

266

Last Page

277

This document is currently not available here.
