Query markup with wildcards and accents

Posted: Wed Apr 03, 2013 10:44 am
by josmani
Hi Guys,

I am using "%mbs" for query markup on Unicode content. When the query contains a wildcard, e.g. niafounk*, the highlighting splits the multibyte character at the end of the word, and Niafounké becomes <b>Niafounk&#65533;</b>&#65533;

Would appreciate any ideas for fixing this.

Posted: Wed Apr 03, 2013 12:10 pm
by Kai
Highlighting uses linear Metamorph search, and the latter, when expanding a wildcard hit, uses the wordc SQL setting to define what a word is. That is, the asterisk will match characters adjacent to the root/prefix as long as they are in the wordc REX character class. By default, wordc includes only ASCII alphabetic characters and single-quote -- not the high-bit bytes that make up multibyte UTF-8 characters. Add the high-bit bytes with this setting:

<$wordc = "[\alpha\x80-\xff\']">
<sql novars "set wordc=$wordc"></sql>

before doing the highlighting. (Webinator's default Language Characters already includes this, for those using Webinator.)
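
For anyone wondering why the character class matters, here is a rough Python sketch of the effect. It is only an illustration of byte-wise wildcard expansion, not how Metamorph is actually implemented: the helper names expand_prefix and highlight are made up, and the "partial" class (ASCII letters plus the single byte 0xC3) is a hypothetical class chosen to reproduce the output in the question, where the hit ends between the two bytes of é.

```python
import re

def expand_prefix(data: bytes, prefix: bytes, word_class: bytes) -> bytes:
    """Match the prefix plus any following run of bytes in word_class."""
    pattern = re.compile(re.escape(prefix) + b"[" + word_class + b"]*",
                         re.IGNORECASE)
    match = pattern.search(data)
    return match.group(0) if match else b""

def highlight(data: bytes, hit: bytes) -> str:
    """Wrap the first occurrence of hit in <b> tags, then decode as UTF-8."""
    marked = data.replace(hit, b"<b>" + hit + b"</b>", 1)
    # Invalid byte sequences show up as U+FFFD, like &#65533; in the question.
    return marked.decode("utf-8", errors="replace")

text = "Niafounké".encode("utf-8")   # é is the two bytes 0xC3 0xA9

# Hypothetical class that covers the first byte of é but not the second.
partial = rb"A-Za-z'\xc3"
# Class mirroring the suggested setting: letters, quote, and bytes 0x80-0xFF.
full = rb"A-Za-z'\x80-\xff"

broken = highlight(text, expand_prefix(text, b"niafounk", partial))
fixed = highlight(text, expand_prefix(text, b"niafounk", full))

print(broken)
print(fixed)
```

With the whole 0x80-0xFF range in the class, the wildcard expansion can never stop in the middle of a multibyte character, so the closing tag always lands on a character boundary.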

Posted: Wed Apr 03, 2013 12:53 pm
by josmani
It worked, many thanks.