Ranked document selection

Document Type

Conference Proceeding

Publication Date

1-1-2014

Abstract

Let D be a collection of string documents of n characters in total. The top-k document retrieval problem is to preprocess D into a data structure that, given a query (P,k), can return the k documents of D most relevant to pattern P. The relevance of a document d for a pattern P is given by a predefined ranking function w(P,d). Linear space and optimal query time solutions already exist for this problem. In this paper we consider a novel problem, document selection queries, which aim to report the kth document most relevant to P (instead of reporting all top-k documents). We present a data structure using O(n log^ε n) space, for any constant ε > 0, answering selection queries in time O(log k / log log n), and a linear-space data structure answering queries in time O(log k), given the locus node of P in a (generalized) suffix tree of D. We also prove that it is unlikely that a succinct-space solution for this problem exists with poly-logarithmic query time. © 2014 Springer International Publishing.
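
To make the difference between a top-k query and a selection query concrete, the following is a minimal brute-force sketch in Python. It is not the data structure from the paper: it rescans every document on each query and uses no suffix tree. The function names, the choice of term frequency as the ranking function w(P,d), the exclusion of documents in which P does not occur, and the tie-breaking by document index are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Sequence


def term_frequency(pattern: str, doc: str) -> int:
    """Assumed ranking function w(P, d): number of (possibly overlapping)
    occurrences of pattern in doc. The abstract leaves w unspecified."""
    if not pattern:
        return 0
    count = 0
    start = doc.find(pattern)
    while start != -1:
        count += 1
        start = doc.find(pattern, start + 1)
    return count


def ranked_ids(
    docs: Sequence[str],
    pattern: str,
    w: Callable[[str, str], int] = term_frequency,
) -> List[int]:
    """Document indices sorted by decreasing relevance.

    Assumptions: documents with score 0 are excluded, and ties are
    broken by the smaller document index.
    """
    scored = [(w(pattern, d), i) for i, d in enumerate(docs)]
    relevant = [(s, i) for s, i in scored if s > 0]
    relevant.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in relevant]


def top_k(docs: Sequence[str], pattern: str, k: int) -> List[int]:
    """Top-k query: the k most relevant documents for the pattern."""
    return ranked_ids(docs, pattern)[:k]


def select_kth(docs: Sequence[str], pattern: str, k: int) -> Optional[int]:
    """Selection query: only the kth most relevant document (1-based),
    or None if fewer than k documents contain the pattern."""
    ranking = ranked_ids(docs, pattern)
    return ranking[k - 1] if 1 <= k <= len(ranking) else None


if __name__ == "__main__":
    docs = ["abab ab", "ab", "xyz", "ababab ab ab"]
    print(top_k(docs, "ab", 2))       # the two most relevant documents
    print(select_kth(docs, "ab", 2))  # only the second most relevant one
```

The baseline spends time proportional to the total length of D on every query, whereas the structures described in the abstract answer a selection query in O(log k / log log n) or O(log k) time once the locus node of P is known.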

Publication Source (Journal or Book title)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

First Page

344

Last Page

356
