Document Type

Article

Publication Date

11-22-2007

Abstract

In this paper, we propose measures for compressed data structures, in which space usage is measured in a data-aware manner. In particular, we consider the fundamental dictionary problem on set data, where the task is to construct a data structure for representing a set S of n items out of a universe U = {0, ..., u - 1} and supporting various queries on S. We use a well-known data-aware measure for set data called gap to bound the space of our data structures. We describe a novel dictionary structure that requires gap + O (n log (u / n) / log n) + O (n log log (u / n)) bits. Under the RAM model, our dictionary supports membership, rank, and predecessor queries in nearly optimal time, matching the time bound of Andersson and Thorup's predecessor structure [A. Andersson, M. Thorup, Tight(er) worst-case bounds on dynamic searching and priority queues, in: ACM Symposium on Theory of Computing, STOC, 2000], while simultaneously improving upon their space usage. We support select queries even faster in O (log log n) time. Our dictionary structure uses exactly gap bits in the leading term (i.e., the constant factor is 1) and answers queries in near-optimal time. When seen from the worst-case perspective, we present the first O (n log (u / n))-bit dictionary structure that supports these queries in near-optimal time under the RAM model. We also build a dictionary which requires the same space and supports membership, select, and partial rank queries even more quickly in O (log log n) time. We go on to show that for many (real-world) datasets, data-aware methods lead to a worthwhile compression over combinatorial methods. To the best of our knowledge, these are the first results that achieve data-aware space usage and retain near-optimal time. © 2007.

Publication Source (Journal or Book title)

Theoretical Computer Science

First Page

313

Last Page

331

Plum Print visual indicator of research metrics
PlumX Metrics
  • Citations
    • Citation Indexes: 36
  • Usage
    • Abstract Views: 1
  • Captures
    • Readers: 12
see details

Share

COinS