Query markup with wildcards and accents

josmani
Posts: 53
Joined: Tue Jun 03, 2003 3:38 am

Post by josmani »

Hi Guys,

I am using "%mbs" for query markup on Unicode content. When the query contains a wildcard, e.g. niafounk*, the highlighting splits the multibyte character at the end of the word, so Niafounké becomes <b>Niafounk&#65533;</b>&#65533;

Would appreciate any ideas for fixing this.
Kai
Site Admin
Posts: 1271
Joined: Tue Apr 25, 2000 1:27 pm

Post by Kai »

Highlighting uses linear Metamorph search, and the latter, when expanding a wildcard hit, uses the wordc SQL setting to define what a word is. That is, the asterisk will match characters adjacent to the root/prefix as long as they are in the wordc REX character class. By default, wordc includes only ASCII alphabetic characters and single-quote -- not high-bit UTF-8 bytes. Add UTF-8 chars with this setting:

<$wordc = "[\alpha\x80-\xff\']">
<sql novars "set wordc=$wordc"></sql>

before doing the highlighting. (Webinator's default Language Characters already includes this, for those using Webinator.)
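
To see why the markup can land in the middle of a character, here is a rough Python sketch of byte-wise wildcard expansion. It is only an emulation, not Texis or REX code: `PARTIAL_WORDC`, `UTF8_WORDC`, `highlight_prefix` and `render` are hypothetical names, and the "partial" class (one that accepts the UTF-8 lead byte 0xC3 but not the continuation byte 0xA9) is an assumption about how such a split could arise, not something confirmed in this thread.

```python
import re

# Hypothetical byte classes (Python regexes, not REX expressions).
# Assumption: a class that accepts the lead byte 0xC3 but rejects 0xA9.
PARTIAL_WORDC = re.compile(rb"[A-Za-z\xc0-\xff']*")
# Rough equivalent of [\alpha\x80-\xff\']: every high-bit byte is a word byte.
UTF8_WORDC = re.compile(rb"[A-Za-z\x80-\xff']*")


def highlight_prefix(text: bytes, prefix: bytes, wordc: "re.Pattern") -> bytes:
    """Wrap the hit for `prefix*` in <b> tags, extending over wordc bytes."""
    start = text.lower().find(prefix.lower())  # ASCII-only case folding
    if start < 0:
        return text
    # The wildcard keeps consuming bytes while they belong to the word class.
    end = wordc.match(text, start + len(prefix)).end()
    return text[:start] + b"<b>" + text[start:end] + b"</b>" + text[end:]


def render(marked: bytes) -> str:
    """Decode as UTF-8 and show non-ASCII characters as numeric entities."""
    decoded = marked.decode("utf-8", errors="replace")
    return "".join(c if ord(c) < 128 else f"&#{ord(c)};" for c in decoded)


text = "Niafounké".encode("utf-8")  # the é is the two bytes 0xC3 0xA9

broken = render(highlight_prefix(text, b"niafounk", PARTIAL_WORDC))
fixed = render(highlight_prefix(text, b"niafounk", UTF8_WORDC))
print(broken)
print(fixed)
```

With the partial class the closing tag falls between the two bytes of é, so each orphaned byte decodes to U+FFFD (decimal 65533), which is the &#65533; seen in the original post. With the \x80-\xff range in the class, both bytes stay inside the hit and the character survives intact.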
josmani
Posts: 53
Joined: Tue Jun 03, 2003 3:38 am

Post by josmani »

It worked, many thanks.