Faculty Publications

Ranked document selection

J. Ian Munro, David R. Cheriton School of Computer Science
Gonzalo Navarro, Universidad de Chile
Rahul Shah, Louisiana State University
Sharma V. Thankachan, University of Central Florida

Document Type

Article

Publication Date

4-6-2020

Abstract

Let D be a collection of string documents of n characters in total. The top-k document retrieval problem is to preprocess D into a data structure that, given a query (P,k), can return the k documents of D most relevant to pattern P. The relevance of a document d for a pattern P is given by a predefined ranking function w(P,d). Linear space and optimal query time solutions already exist for this problem. In this paper we consider a novel problem, document selection, in which a query (P,k) aims to report the kth document most relevant to P (instead of reporting all top-k documents). We present a data structure using O(n log^ε n) space, for any constant ε > 0, answering selection queries in time O(log k / log log n), and a linear-space data structure answering queries in time O(log k), given the locus node of P in a (generalized) suffix tree of D. We also prove that it is unlikely that a succinct-space solution for this problem exists with poly-logarithmic query time, and that O(log k / log log n) is indeed optimal within O(n polylog n) space for most text families. Finally, we present some additional space-time trade-offs exploring the extremes of those lower bounds.
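To make the query semantics concrete, the following is a minimal brute-force sketch of a selection query (P,k). It is not the data structure from the paper: it scans every document at query time and uses no suffix tree. It assumes that w(P,d) is term frequency (the number of occurrences of P in d), that documents not containing P are excluded, and that ties are broken by document index. All three are illustrative assumptions, and the function names are hypothetical.

```python
from typing import List, Optional


def term_frequency(pattern: str, document: str) -> int:
    """Count occurrences of pattern in document, overlaps included.

    Assumption: this plays the role of the ranking function w(P, d).
    """
    if not pattern:
        return 0
    count = 0
    start = document.find(pattern)
    while start != -1:
        count += 1
        # Advance by one character so overlapping matches are counted.
        start = document.find(pattern, start + 1)
    return count


def select_kth_document(docs: List[str], pattern: str, k: int) -> Optional[int]:
    """Return the index of the k-th most relevant document (1-based k).

    Brute force: scores every document, then sorts. Returns None when
    fewer than k documents contain the pattern.
    """
    scored = []
    for index, doc in enumerate(docs):
        weight = term_frequency(pattern, doc)
        if weight > 0:
            scored.append((weight, index))
    # Highest weight first; ties broken by smaller index (an assumption).
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    if k < 1 or k > len(scored):
        return None
    return scored[k - 1][1]


if __name__ == "__main__":
    collection = ["banana", "bandana", "cabana", "apple"]
    for rank in range(1, 5):
        print(rank, select_kth_document(collection, "an", rank))
```

Unlike a top-k query, which would report all of the k best documents, the selection query returns only the single document at rank k. The sketch costs time proportional to the total collection size per query, which is the cost the indexed solutions in the abstract avoid.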

Publication Source (Journal or Book title)

Theoretical Computer Science

First Page

149

Last Page

159

Recommended Citation

Munro, J., Navarro, G., Shah, R., & Thankachan, S. (2020). Ranked document selection. Theoretical Computer Science, 812, 149-159. https://doi.org/10.1016/j.tcs.2019.10.008

