Morpheme Stripping and Index Build

wdavies
Posts: 19
Joined: Mon Dec 17, 2001 5:15 pm

Post by wdavies

Hi,

I just tried switching on Morpheme Stripping (aka Stemming) by setting minwordlen to 5.

It seems to cause a BIG performance hit (about an order of magnitude). My guess is that this is because the index wasn't built with that setting?

Is there a way to build the index with it? Should I just set minwordlen to the same value in the indexing script?

Cheers,
Winton
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John

The minwordlen setting does not affect the index build process. Were you testing with words that have a lot of derivatives, or with some very common ones? Most words do not have enough derivatives to cause that kind of performance impact.
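
To illustrate why the number of derivatives matters, here is a rough toy model in Python. It is not how Texis implements morpheme stripping: the suffix list, the vocabulary, and the function names strip_suffixes and matching_terms are all made up for the example. The only idea taken from this thread is that a word is never stripped below a minimum length (5 here), and that a query on a word with many derivatives has more index terms to visit.

```python
# Toy model only: the suffix list and vocabulary are hypothetical,
# not taken from Texis or any real index.
SUFFIXES = ["ing", "ed", "es", "s", "er", "ly"]

VOCABULARY = [
    "connect", "connected", "connecting", "connects",
    "connection", "zebra", "zebras", "runs",
]


def strip_suffixes(word, min_word_len=5):
    """Strip known suffixes repeatedly, never going below min_word_len."""
    changed = True
    while changed:
        changed = False
        for suffix in SUFFIXES:
            remaining = len(word) - len(suffix)
            if word.endswith(suffix) and remaining >= min_word_len:
                word = word[:remaining]
                changed = True
                break
    return word


def matching_terms(query_word, vocabulary, min_word_len=5):
    """Return every vocabulary word that strips to the same root as the query."""
    root = strip_suffixes(query_word, min_word_len)
    return sorted(
        w for w in vocabulary if strip_suffixes(w, min_word_len) == root
    )


if __name__ == "__main__":
    for query in ("connecting", "zebras", "runs"):
        terms = matching_terms(query, VOCABULARY)
        print(query, "->", len(terms), "terms:", terms)
```

In this sketch "connecting" expands to four vocabulary terms, "zebras" to two, and "runs" to one, since stripping the "s" would leave fewer than five letters. A query whose root matches many terms has proportionally more postings to look up, which is the kind of case the question above is asking about.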
John Turnbull
Thunderstone Software