diacritical marks interaction with thesaurus

Posted: Mon Jun 04, 2012 1:07 pm
by rjshelq
Hi,

When a word appears in my custom thesaurus, the search fails to ignore diacritical marks.

For example, these words all refer to the same Persian word:

saki, sâkî, sākī, saqi, sâqî, sāqī

If I search for "saki" (and saki is not in my custom thesaurus), I get search hits on saki, sâkî and sākī as expected, with diacritical marks properly ignored.

However, I also want hits for the variant transliterated with a q (saqi) instead of a k (saki). So I added saki,saqi to my custom thesaurus, expecting that a search for saki would then return hits for all variations: saki, sâkî, sākī, saqi, sâqî, sāqī and any other forms that differ only in diacritical marks.

Unfortunately, as soon as I added saki,saqi to the custom thesaurus, searches for that word stopped ignoring diacritical marks. Now if I search for "saki" I only get saki and saqi, and no hits on the variants with diacritical marks.

Is there a way for searches to maintain diacritical insensitivity for words which appear in my custom thesaurus?
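
The expected behaviour can be sketched in a few lines of Python. This is only an illustration of the desired matching, not Texis code: the names fold, expand, matches and THESAURUS are hypothetical, and the sketch assumes that "ignoring diacritical marks" means stripping Unicode combining marks after decomposition.

```python
import unicodedata

def fold(word):
    """Strip combining marks, so 'sâkî' and 'sākī' both become 'saki'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Hypothetical stand-in for the custom thesaurus entry "saki,saqi".
THESAURUS = [{"saki", "saqi"}]

def expand(term):
    """Return the folded term plus any folded thesaurus equivalents."""
    key = fold(term.lower())
    for group in THESAURUS:
        folded = {fold(w.lower()) for w in group}
        if key in folded:
            return folded
    return {key}

def matches(query, word):
    """True if the word, with diacritics folded, is one of the expanded query terms."""
    return fold(word.lower()) in expand(query)
```

With this sketch, matches("saki", w) is true for each of saki, sâkî, sākī, saqi, sâqî and sāqī, which is the result I was hoping for from the real search.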

Posted: Mon Jun 04, 2012 1:55 pm
by Kai
It is not currently possible: textsearchmode (the setting that controls whether diacritical marks are ignored or respected) is not yet implemented for the thesaurus, only for straight single-term sets (no equivalences). A fix is planned for a future release, but it is probably several months away.

Posted: Tue Apr 02, 2013 6:08 am
by josmani
I've just noticed that we have the same issue on full Texis 6.01 when using the thesaurus.

Any workarounds you can suggest?