Fuzzy Search Time

Tyler
Posts: 3
Joined: Mon Sep 10, 2001 2:19 pm

Fuzzy Search Time

Post by Tyler »

We're having timeout issues when doing fuzzy searching. Some details:
* Our timeout is set to 500 seconds (~8 minutes)
* The database is ~500,000 records
* The database contains OCR
* The OCR field is set up as a BLOB datatype

A user was searching for '%98thomas' on the OCR field when they hit the timeout. We suggested that they add another search parameter to limit the number of records, which they did. That query would have narrowed the set to 150,000 records, but the fuzzy search still timed out.

Questions:
1) Is it normal for fuzzy searching to take this long?
2) Why does it take so long? Is it scanning the data directly?
3) Does the OCR being BLOB have any effect here?
4) Any suggestions for increasing fuzzy search speed?
Kai
Site Admin
Posts: 1272
Joined: Tue Apr 25, 2000 1:27 pm

Fuzzy Search Time

Post by Kai »

1) Fuzzy searching is not indexable and must linearly search every potential row. This can take significantly longer than an indexed search, depending on the relative percentage of records to be searched (100% if it's the only query term) and the size of the data.
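To illustrate the difference, here is a rough Python sketch. It is not how the search engine is implemented internally; the function names, the word-level index, and the use of difflib's similarity ratio as the "fuzzy" test are all assumptions made for the example. The point is only that an exact term can be answered from an index without touching the rows, while an approximate match has to look at every row.

```python
from difflib import SequenceMatcher


def build_index(rows):
    """Map each exact word to the set of row ids containing it (hypothetical index)."""
    index = {}
    for row_id, text in enumerate(rows):
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(row_id)
    return index


def exact_lookup(index, term):
    """Indexed search: one dictionary lookup, no rows are read."""
    return index.get(term.lower(), set())


def fuzzy_scan(rows, term, threshold=0.8):
    """Fuzzy search: every row must be visited and compared.

    The similarity ratio is a stand-in for a real approximate matcher.
    Returns the matching row ids and the number of rows visited.
    """
    term = term.lower()
    hits = set()
    visited = 0
    for row_id, text in enumerate(rows):
        visited += 1
        for word in text.lower().split():
            if SequenceMatcher(None, term, word).ratio() >= threshold:
                hits.add(row_id)
                break
    return hits, visited
```

With an OCR misread such as "thomos", exact_lookup(index, "thomas") cannot find the row because that spelling was never indexed, whereas fuzzy_scan finds it but only after visiting every row.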

2) While 8 minutes is rather slow, it is not unexpected if the table is several GB or more, and/or the machine is under load (servicing many queries). What is the size of the table (especially the .blb file size)? What is the typical query rate?

3) It doesn't make much difference; the data still has to be linearly searched.

4) Narrow the query further by adding more required, indexable terms (mark them with + or turn likepallmatch on), so that fewer rows are left for the fuzzy term to scan.
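As a rough sketch of why this helps (again in Python, with made-up names and difflib's ratio standing in for the real fuzzy matcher, so treat it as an illustration under those assumptions rather than the engine's actual behavior): a required indexed term is resolved first, and the linear fuzzy scan then runs only over the rows that survive.

```python
from difflib import SequenceMatcher


def build_index(rows):
    """Map each exact word to the set of row ids containing it (hypothetical index)."""
    index = {}
    for row_id, text in enumerate(rows):
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(row_id)
    return index


def is_fuzzy_match(text, term, threshold=0.8):
    """True if any word in text is similar enough to term (stand-in matcher)."""
    term = term.lower()
    return any(
        SequenceMatcher(None, term, word).ratio() >= threshold
        for word in text.lower().split()
    )


def search(rows, index, fuzzy_term, required_term=None):
    """Run the fuzzy term over the candidate rows only.

    Without a required term every row is a candidate; with one, the index
    cuts the candidate list down before any row is scanned.
    Returns the matching row ids and the number of rows scanned.
    """
    if required_term is None:
        candidates = list(range(len(rows)))
    else:
        candidates = sorted(index.get(required_term.lower(), ()))
    hits = [row_id for row_id in candidates if is_fuzzy_match(rows[row_id], fuzzy_term)]
    return hits, len(candidates)
```

The scan cost follows the number of candidates, not the size of the table, so each additional required term that is selective shrinks the work left for the fuzzy term.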