French accents normalization

edev
Posts: 127
Joined: Wed Sep 14, 2005 5:10 pm

Post by edev »

Thanks, Mark. I changed it to:

<a name=normword hit>
<local fixed>
<sandr "\xE9" "e" $ret>

<$normword=$ret>
<fmt "(%s,%s)" $hit $normword>

</a>

as a test, but it doesn't seem to pick up the query at all. I also tried building a custom user equivalence list, eqvsusr.lst, with backref.exe. It ran fine in a DOS command window, but when I include the list in the getapisettings function, the equivalent words are not picked up. I even put a "~" in front of the word, and it still doesn't work.

<a name=getapisettings public>
<local sets apis>
<$sets=SSc_alequivs SSc_alintersects SSc_allinear SSc_alpostproc SSc_alwithin SSc_exactphrase SSc_keepnoise SSc_alnot SSc_alwild>
<$apis=alequivs alintersects allinear alpostproc alwithin exactphrase keepnoise alnot alwild>

<loop $sets $apis>
<getvar $sets>
<switch $ret>
<case Y><apicp $apis on>
<case N><apicp $apis off>
</switch>
</loop>
<apicp ueqprefix "C:\Program Files\Thunderstone Software\Webinator\eqvsusr">
</a>

Any idea why? Thanks!
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Your sandr calls should start from $hit and then chain on $ret:
<sandr "\xE9" "e" $hit>
<sandr "\xC9" "\x45" $ret>
<sandr "\xEA" "\x65" $ret>
etc.
<$normword=$ret>

For equivs you also need "keepeqvs" on.
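
As a rough illustration of the chaining idea in the sandr calls (the first replacement reads the original hit, each later one reads the previous result), here is a small Python analogue. It is not Vortex code: the names ACCENT_MAP and normalize_word are hypothetical, and the map only covers the three characters mentioned in this post.

```python
# Hypothetical Python analogue of the chained sandr calls; not Vortex code.
# Keys are Latin-1 code points, matching the \xE9, \xC9, \xEA escapes above.
ACCENT_MAP = {
    "\xe9": "e",  # e with acute accent
    "\xc9": "E",  # E with acute accent
    "\xea": "e",  # e with circumflex
}


def normalize_word(hit: str) -> str:
    """Apply each replacement to the result of the previous one."""
    ret = hit  # the first step starts from the original hit
    for accented, plain in ACCENT_MAP.items():
        ret = ret.replace(accented, plain)  # later steps chain on ret
    return ret


if __name__ == "__main__":
    print(normalize_word("montr\xe9al"))
```

If a character is missing from the map, it passes through untouched, which is the same symptom as a missing sandr line.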

Post by edev »

Hi Mark, I made the changes, but I still get exactly the same results as before. Instead of (montréal,montreal) I'm still getting (montréal,montréal).


<a name=fixword hit>
<local fixed>
<sandr "\xC9" "E" $hit>
<sandr "\xEA" "\x65" $ret>
<$fixword=$ret>
<fmt "(%s,%s)" $hit $fixword>
fix word is $fixword
</a>

<a name=fixquery sq>
<fmtcp sandcall noesc "[^\space,\x22()]*>>[\x80-\xFF]=[^\space,\x22]*" fixword>
<capture><sb>$sq</sb></capture>
<$sq = $ret>
<html>the query is $sq</html>

</a>


<A NAME=main public>
<top><!-- top of page boilerplate -->
<fixquery sq=$&query>
<search>
...
</A>

The "keepeqvs" is set to 1, so it has always been on.

Post by mark »

You don't have any sandr for é. You need
<sandr "\xC3\xA9" "e" $ret> and/or
<sandr "\xE9" "e" $ret>
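
The two patterns correspond to the two common byte encodings of é: \xE9 is its single byte in Latin-1 (ISO-8859-1), and \xC3\xA9 is its two-byte UTF-8 form. Which one the query actually arrives in depends on the page and form encoding, which this thread does not state, so the Python sketch below simply handles both. The function name strip_e_acute is hypothetical and only illustrates the byte-level replacement.

```python
# Hypothetical Python sketch; shows why both byte patterns may be needed.
E_ACUTE = "\u00e9"  # the character é


def strip_e_acute(raw: bytes) -> bytes:
    """Replace é with e whether the bytes are UTF-8 or Latin-1 encoded."""
    ret = raw.replace(b"\xc3\xa9", b"e")  # UTF-8 form of é
    ret = ret.replace(b"\xe9", b"e")      # Latin-1 form of é
    return ret


latin1_query = ("montr" + E_ACUTE + "al").encode("latin-1")
utf8_query = ("montr" + E_ACUTE + "al").encode("utf-8")

if __name__ == "__main__":
    print(strip_e_acute(latin1_query))
    print(strip_e_acute(utf8_query))
```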

Post by edev »

It works now, thank you!