Partial document ranking by heuristic methods

Dik Lun Lee, Wai Yee Peter Wong

Research output: Chapter in Book/Conference Proceeding/Report, Conference Paper published in a book, peer-review

1 Citation (Scopus)

Abstract

In this paper, we study three methods for implementing the tf × idf ranking strategy with inverted files, where tf stands for term frequency and idf stands for inverse document frequency. The first one sorts the postings lists of the query terms by increasing length. It is the traditional sorting method used in the upperbound search algorithm. The second one sorts query terms based upon two parameters, namely the maximum tf of the postings list and the list length. The third one first requires each postings list to be sorted by decreasing tf value. It sorts disk pages, rather than postings lists, based upon three parameters: the maximum tf of the disk page, the length of the postings list and the number of document identifiers in the disk page. We show that the second and third methods are able to identify a large portion of top documents without using a large amount of disk page accesses. They outperform the first method by a large margin. The performance of these methods is demonstrated by experimental runs on four test collections made available with the SMART system.
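
As an illustration only, the first two orderings described in the abstract can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the toy index, the idf formula, the way maximum tf and list length are combined (max tf × idf), and the names INDEX, NUM_DOCS, order_by_length, order_by_max_tf_and_length and partial_rank are all hypothetical. The third, page-level method is not sketched.

```python
import math
from collections import defaultdict

# Toy inverted file: term -> postings list of (doc_id, tf) pairs.
# All values are made up for illustration.
INDEX = {
    "ranking": [(1, 3), (2, 1), (4, 2)],
    "heuristic": [(2, 1)],
    "document": [(1, 1), (2, 1), (3, 9), (4, 1), (5, 1)],
}
NUM_DOCS = 10  # assumed collection size


def idf(term):
    # A common idf variant; the paper may define idf differently.
    return math.log(NUM_DOCS / len(INDEX[term]))


def order_by_length(terms):
    # First method in the abstract: shortest postings list first.
    return sorted(terms, key=lambda t: len(INDEX[t]))


def order_by_max_tf_and_length(terms):
    # Second method uses the maximum tf and the list length.
    # Combining them as max_tf * idf is an assumption of this sketch.
    def potential(term):
        max_tf = max(tf for _, tf in INDEX[term])
        return max_tf * idf(term)

    return sorted(terms, key=potential, reverse=True)


def partial_rank(ordered_terms, max_lists):
    # Accumulate tf * idf scores from only the first max_lists postings lists,
    # then return (doc_id, score) pairs by decreasing score.
    scores = defaultdict(float)
    for term in ordered_terms[:max_lists]:
        weight = idf(term)
        for doc_id, tf in INDEX[term]:
            scores[doc_id] += tf * weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


query = ["ranking", "heuristic", "document"]
by_length = order_by_length(query)
by_potential = order_by_max_tf_and_length(query)
```

Calling partial_rank(by_length, 1) and partial_rank(by_potential, 1) processes a single postings list under each ordering, which makes it easy to compare which documents each ordering surfaces first. The toy data says nothing about the experimental results reported in the paper.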

Original language: English
Title of host publication: Advances in Computing and Information – ICCI 1991 - International Conference on Computing and Information, Proceedings
Editors: Waldemar W. Koczkodaj, Frank Dehne, Frantisek Fiala
Publisher: Springer Verlag
Pages: 231-239
Number of pages: 9
ISBN (Print): 9783540540298
Publication status: Published - 1991
Externally published: Yes
Event: 3rd International Conference on Computing and Information, ICCI 1991 - Ottawa, Canada
Duration: 27 May 1991 to 29 May 1991

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 497 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 3rd International Conference on Computing and Information, ICCI 1991
Country/Territory: Canada
City: Ottawa
Period: 27/05/91 to 29/05/91

Bibliographical note

Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 1991.
