Please explain "potential" in the context below

Post by **Thunderstone** » Tue May 13, 1997 7:58 pm

I would like to redirect a user query to you:

% gw -h
Webinator WWW Site Indexer Version 1.3 (Commercial)
Copyright(c) 1995,1996 Thunderstone EPI Inc.
Release: 19960816

When searching for a string (within:document,word forms: exact), the
bottom of the results has a link for

"Hits 11-20 of a potential 36 hits."

Clicking on this link gives a page with hits 11-17. This appears quite
misleading/confusing to the user, who wishes to classify this as a "BUG".

Is this behavior the result of code design?

Post by **Thunderstone** » Wed May 14, 1997 8:26 am

The word "potential" means the number shown is the possible number of
matches. It is exact when the proximity is set to "page". Otherwise the
index only tells the engine that the terms are co-resident within rough
proximity of each other, which is how it achieves small, fast indexes.
Actual proximity is resolved in a post-examination of the documents that
"potentially" match.

In order to produce the actual number we'd have to look at them
all, which in some cases could take a long time. We'd rather
get the answers faster with a little fuzziness in the count.
A close examination of other engines would reveal similar behavior
with far less search efficacy. We could produce a totally useless
word frequency count instead, or truncate at 200 hits like most engines
do.
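
To make the two-stage behavior described in this reply concrete, here is a minimal Python sketch. It is not Webinator's code: the index layout, the function names (`build_index`, `potential_hits`, `within_proximity`), the 3-word window and the sample documents are all assumptions made for illustration. The index records only which documents contain each word, so it can count candidates cheaply, while the real proximity check has to read each candidate.

```python
import re

def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each word to the set of document ids containing it.
    No word positions are stored, which keeps the index small."""
    index = {}
    for doc_id, text in docs.items():
        for word in set(tokenize(text)):
            index.setdefault(word, set()).add(doc_id)
    return index

def potential_hits(index, terms):
    """Index-only answer: documents that contain every term somewhere."""
    sets = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*sets)) if sets else []

def within_proximity(text, terms, window):
    """Post-examination: True if some run of `window` consecutive
    words in the text contains all of the terms."""
    words = tokenize(text)
    wanted = set(terms)
    for start in range(len(words)):
        if wanted <= set(words[start:start + window]):
            return True
    return False

# Hypothetical sample data, not taken from the thread.
DOCS = {
    1: "fast index search engine",
    2: "fast cars are fun and the search for parking is slow "
       "and nobody likes an index",
    3: "search the fast index",
}
TERMS = ["fast", "index"]

index = build_index(DOCS)
potential = potential_hits(index, TERMS)      # cheap: index lookup only
confirmed = [d for d in potential             # costly: reads each document
             if within_proximity(DOCS[d], TERMS, window=3)]

print(f"{len(confirmed)} confirmed of a potential {len(potential)} hits")
```

In this sketch the index-only count is 3, but only 2 documents survive the examination. That is the same kind of gap as a results page advertised as hits 11-20 that ends at hit 17: the total is known before the documents are read, and the shortfall only shows up once they are.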