european characters

Thunderstone
Site Admin
Posts: 2504
Joined: Wed Jun 07, 2000 6:20 pm


Post by Thunderstone »




Hi,
I noticed that the search program has problems with some international
characters, in my case Turkish ones. The Turkish alphabet is almost the same
as the English one except for three or four characters (as in German and French). The
character causing problems is: ş. I am not sure whether you can see it
correctly.
Baris.


Post by Thunderstone »



Make sure you're using an index expression (-k option) that includes
those letters. The default expression, \alnum{2,30}, excludes 8-bit
characters (bytes 0x80-0xFF) since they are usually non-text.

Use a customized -k option such as

-k"[\alnum\x80-\xFF]{2,30}"

before performing a walk. If you've already performed a walk, unindex
your database and re-index it with the same -k option:

gw -unindex
gw -k"[\alnum\x80-\xFF]{2,30}" -index
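To see why the wider character class matters, here is a rough sketch in Python. It is only an analogy: Python's byte-level regex stands in for the index expression, [A-Za-z0-9] stands in for \alnum, and the sample text and the ISO-8859-9 (Latin-5) encoding are assumptions, not something gw itself does. It also shows why a Turkish ş can show up as þ when the same byte is read as Latin-1.

```python
import re

# Stand-ins for the two index expressions. These are Python byte regexes,
# NOT Thunderstone REX; [A-Za-z0-9] approximates \alnum for this sketch.
DEFAULT_EXPR = re.compile(rb"[A-Za-z0-9]{2,30}")
EXTENDED_EXPR = re.compile(rb"[A-Za-z0-9\x80-\xFF]{2,30}")


def index_terms(pattern, text, encoding="iso8859_9"):
    """Return the byte runs the pattern would pick out as index terms."""
    return pattern.findall(text.encode(encoding))


# Hypothetical sample text (Turkish for "bird and tree").
sample = "kuş ve ağaç"

# With the ASCII-only class, words are cut at every 8-bit byte, and
# fragments shorter than two characters are dropped entirely.
default_terms = index_terms(DEFAULT_EXPR, sample)

# With \x80-\xFF included, the Turkish letters stay inside the words.
extended_terms = index_terms(EXTENDED_EXPR, sample)

# The same byte means different letters in different 8-bit encodings:
# 0xFE is "ş" in ISO-8859-9 but "þ" in ISO-8859-1.
misread = "ş".encode("iso8859_9").decode("latin-1")

print(default_terms)
print(extended_terms)
print(misread)
```

With the ASCII-only pattern, "kuş" is indexed as "ku" and "ağaç" is lost altogether, so a search for either word cannot match. That is the behaviour the customized -k expression is meant to avoid.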

You should also make sure that your web server's locale is set correctly for
your language.



