aagarwal
Posts: 5 Joined: Tue Feb 12, 2008 7:05 pm
by aagarwal » Tue Feb 12, 2008 8:21 pm
We have two records with the title 'horses'. With minwordlen=3, a count of title like 'horse' returns 0, while a count of title like 'horses' returns 2.
With minwordlen=4, both counts return 2.
I am using tsql with the default settings for suffixproc (on) and the default suffix list.
We cannot set minwordlen to 4, because we also want "dog" and "dogs" to match in the same way.
Should we be setting minwordlen dynamically, depending on the length of the word?
John
Site Admin
Posts: 2622 Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH
by John » Tue Feb 12, 2008 9:04 pm
What is your suffix list? You may also need to change the default suffix removal setting defsuffrm.
John Turnbull
Thunderstone Software
by aagarwal » Tue Feb 12, 2008 9:15 pm
The default suffix list. I am using tsql to get the counts, with the default defsuffrm.
When doing the same from Vortex, defsuffrm=0 and the suffix list is "'", "s", "es".
mark
Site Admin
Posts: 5519 Joined: Tue Apr 25, 2000 6:56 pm
by mark » Wed Feb 13, 2008 10:42 am
You could add "se" to the suffix list, though it could cause the occasional extra hit, such as "treats" matching "treatise" with defsuffrm on. With defsuffrm off, I doubt it would hurt at all.
by aagarwal » Wed Feb 13, 2008 11:42 am
Then why does it work with minwordlen=4?
I have tried looking in the documentation. Can you explain how suffix processing (suffixproc) works? I understand that a word is reduced to its stripped form based on minwordlen and the suffix list. Is that done to the word you enter as the search term, or to all indexed words?
What logic will help me decide what to include in the suffix list? In other words, why would adding "se" work?
Thanks.
by mark » Wed Feb 13, 2008 11:50 am
Suffix rules apply to both query terms and text terms.
With defsuffrm=0 (the defsuffrm=1 result is shown in parentheses where it differs):
minwordlen 4, sufs s es:
horses -> hors
horse -> horse (defsuffrm=1 -> hors)
minwordlen 3, sufs s es:
horses -> hors -> hor
horse -> horse (defsuffrm=1 -> hors)
minwordlen 4, sufs s es se:
horses -> hors
horse -> horse (defsuffrm=1 -> hors)
minwordlen 3, sufs s es se:
horses -> hors -> hor
horse -> hor
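
For anyone who wants to experiment with suffix lists before changing settings, the trace above can be reproduced with a small Python model. This is only a sketch inferred from the four cases listed, not Texis's actual implementation: it assumes the longest matching suffix is stripped first, that stripping repeats until nothing more applies, and that a suffix is only removed when the remaining stem is at least minwordlen characters long. The function name strip_suffixes is made up for illustration, and defsuffrm is not modelled (the sketch corresponds to defsuffrm=0).

```python
def strip_suffixes(word, suffixes, minwordlen):
    """Hypothetical model of suffix stripping (defsuffrm=0 only).

    Repeatedly removes the longest matching suffix, but only while the
    remaining stem keeps at least `minwordlen` characters.
    """
    # Try longer suffixes first; ignore empty strings to avoid looping forever.
    ordered = sorted((s for s in suffixes if s), key=len, reverse=True)
    stem = word
    stripped = True
    while stripped:
        stripped = False
        for suf in ordered:
            if stem.endswith(suf) and len(stem) - len(suf) >= minwordlen:
                stem = stem[:-len(suf)]
                stripped = True
                break  # restart with the shortened stem
    return stem


if __name__ == "__main__":
    # The same four configurations as in the trace above.
    for sufs in (["s", "es"], ["s", "es", "se"]):
        for minwordlen in (4, 3):
            print(f"minwordlen {minwordlen}, sufs {' '.join(sufs)}:")
            for word in ("horses", "horse"):
                print(f"  {word} -> {strip_suffixes(word, sufs, minwordlen)}")
```

Under these assumptions, the query term and the text term only match when both reduce to the same stem, which is why "se" helps at minwordlen 3: it lets "horse" reach the same stem as "horses".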