'Paul' vs 'John' require linear search.

sroth
Posts: 44
Joined: Mon Jul 23, 2007 11:21 am

Post by sroth »

When I do a LIKEP search on 'paul', I get the following message: <!-- 115 /search:671: Query `paul' would require post-processing: Index expression(s) do not match term `p' -->

When I do the same search on 'John', the results are good.

What could the difference be between these two searches?

select COUNT(*) cnt FROM Crawdaddy WHERE Content LIKEP ('paul');

select COUNT(*) cnt FROM Crawdaddy WHERE Content LIKEP ('john');
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

What were your index expressions when creating the index on Content?

Post by sroth »

<SQL "set keepnoise=1;"></SQL>
<SQL "set minwordlen=2;"></SQL>

Post by sroth »

The index was created using this statement after keepnoise and minwordlen were set.

<SQL "create metamorph inverted index Crawdaddy_Content on Crawdaddy(Content)"></SQL>

Post by mark »

What about addexp?

Post by mark »

Or have you set up a custom locale such that "a" would be considered whitespace?

Post by sroth »

I haven't explicitly set addexp, and I don't have a custom locale set up (at least not to my knowledge).
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

Your minwordlen is too low. It is stripping off the "a" and "ul" from "Paul" as suffixes, leaving only "P". You could <SQL "set defsuffrm = 0"></SQL> to prevent the "a" being stripped, but generally you would want a more limited set of suffixes if you set minwordlen that low.
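
To make the effect concrete, here is a toy Python model of suffix stripping bounded by a minimum word length. It is only a sketch: the suffix list (`HYPOTHETICAL_SUFFIXES`), the function name `strip_suffixes`, and the exact stripping rules are assumptions made for illustration and are not Texis's real suffix list or algorithm. It assumes that a suffix is removed only if the remainder is at least `minwordlen` long, and that a trailing vowel may then be dropped as well.

```python
# Toy model only. The suffix list and rules below are assumptions for
# illustration, not the actual Texis suffix list or stripping algorithm.
HYPOTHETICAL_SUFFIXES = ["ing", "ul", "s"]
VOWELS = set("aeiou")


def strip_suffixes(word, minwordlen):
    """Strip listed suffixes while the remainder stays >= minwordlen long.

    If at least one suffix was removed, a single trailing vowel is also
    dropped (assumed rule), which can leave the word one letter shorter
    than minwordlen.
    """
    w = word.lower()
    stripped = False
    changed = True
    while changed:
        changed = False
        for suffix in HYPOTHETICAL_SUFFIXES:
            if w.endswith(suffix) and len(w) - len(suffix) >= minwordlen:
                w = w[: -len(suffix)]
                stripped = changed = True
                break
    # Assumed trailing-vowel removal, applied only after a suffix was stripped.
    if stripped and len(w) > 1 and w[-1] in VOWELS:
        w = w[:-1]
    return w


print(strip_suffixes("paul", 2))  # "p": too short to match the index expression
print(strip_suffixes("paul", 4))  # "paul": nothing can be stripped
print(strip_suffixes("john", 2))  # "john": no listed suffix matches
```

Under these assumptions, 'paul' collapses to a single letter when minwordlen is 2, while 'john' is untouched at either setting, which is consistent with the different behavior of the two queries.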
John Turnbull
Thunderstone Software

Post by sroth »

Well, setting minwordlen=4 solved the issue with 'Paul', and 'john' and 'bob' still work. I guess this is a trial-and-error situation to find the best setting. Any advice? Thanks.

Post by mark »

minwordlen shouldn't be less than 4, or maybe 3 at the lowest. 4 or 5 is typical.