using like with foreign chars

basement_addict
Posts: 78
Joined: Mon Nov 19, 2001 5:20 pm

Post by basement_addict »

With English text, a LIKE search for 'san diego' matches 'SAN DIEGO', 'SaN DieGo', etc.

We are running into big problems with foreign characters and LIKE.

Example, a Polish city:

ŁOMIANKI

When doing a LIKE search on this city, it only returns cities with the capital Ł.

It would be nice if łomianki were also returned, but the upper- and lower-case characters are represented by different bytes:

Upper Ł: 0xC5 0x81

Lower ł: 0xC5 0x82
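
For anyone who wants to check those byte values, here is a minimal Python sketch. It assumes the data is stored as UTF-8; `utf8_hex` is just a helper name made up for this illustration.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of text as space-separated hex values."""
    return " ".join(f"0x{b:02X}" for b in text.encode("utf-8"))


print("Upper Ł:", utf8_hex("Ł"))  # Upper Ł: 0xC5 0x81
print("Lower ł:", utf8_hex("ł"))  # Lower ł: 0xC5 0x82
```

The two letters share the lead byte 0xC5 and differ only in the second byte, so a case-insensitive comparison that only understands ASCII letters will never treat them as equal.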


So as a hack I wrote some sandr statements that turn a word like branżą into bran*.
For the most part this works, except when the foreign character is the first character, as in *OMIANKI (alpostproc is unacceptable in our jobsearch).
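
To make the hack concrete, here is a rough Python equivalent of the idea. This is not the actual sandr expressions, only a sketch of the same substitution; `wildcard_non_ascii` is a hypothetical name.

```python
import re


def wildcard_non_ascii(word: str) -> str:
    """Replace each run of non-ASCII characters with a single '*'."""
    return re.sub(r"[^\x00-\x7F]+", "*", word)


print(wildcard_non_ascii("branżą"))    # bran*
print(wildcard_non_ascii("ŁOMIANKI"))  # *OMIANKI
```

The second result shows the problem case: the wildcard ends up at the front of the term.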

I'm curious to see what other international search engines have done and if you have a solution.
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

If you have a table in a single foreign language that has a single-byte encoding and a locale that knows it, you can use that.

With set qminprelen = 0 and set wildsingle = 1, you should be able to run the wildcard query with a linear scan of the dictionary rather than of all the data.

Otherwise, you could force both the data and the query to lower case outside Texis.
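
A minimal sketch of that last option in Python, assuming the text is UTF-8 and that the same function is applied both to the data before loading and to each query term. The name `normalize_for_search` is hypothetical and not part of Texis.

```python
import unicodedata


def normalize_for_search(text: str) -> str:
    """Lower-case text using Unicode rules so that Ł and ł compare equal."""
    # NFC first, so composed and decomposed spellings produce the same string.
    return unicodedata.normalize("NFC", text).lower()


print(normalize_for_search("ŁOMIANKI"))   # łomianki
print(normalize_for_search("SaN DieGo"))  # san diego
```

Since both sides go through the same function, the search no longer depends on the engine knowing how to case-fold non-ASCII letters.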
John Turnbull
Thunderstone Software