I want to convert characters with diacritical marks to their closest keyboard equivalents, so that searches without the diacritics will find the information. However, what I thought would work isn't working. For example, if I try to convert é to e in string $str, I use
<sandr "\x82" "e" $str>
but $ret is the same as $str. My ASCII chart lists 82 as the hex value for é. Is there something wrong with my method, or is there a mismatch between my ASCII chart (http://www.cdrummond.qc.ca/cegep/inform ... /ascii.htm) and Texis's?
I think I may have found it. I realized that I was using the PC-DOS extended ASCII set, but our system is Solaris. I think I've found what I need in /usr/pub/iso, so I'll try that.
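The mismatch described above can be checked outside Texis. Here is a small Python sketch (not Vortex code), assuming the Solaris system uses ISO 8859-1 (Latin-1), which is what the /usr/pub/iso chart suggests; the variable names are only illustrative:

```python
# The same character has different byte values in different code pages.
E_ACUTE = "\u00e9"  # the character é

# PC-DOS extended ASCII (code page 437), as on the chart in the question.
dos_byte = E_ACUTE.encode("cp437")

# ISO 8859-1 (Latin-1), assumed here to be the Solaris encoding.
iso_byte = E_ACUTE.encode("latin-1")

print(dos_byte.hex())  # 82
print(iso_byte.hex())  # e9
```

If that assumption holds, a search-and-replace for "\x82" would never match an é in the data, which would explain why $ret comes back identical to $str.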
This <fetch> with a second argument will use $str as the raw HTML instead of actually fetching it from http://localhost/foo.html (the URL must still end in .html, though). <urltext> will then return the "formatted" text, with 8-bit characters translated to their 7-bit equivalents (because of the <urlcp>). (The text will also be word-wrapped if it's longer than 80 characters, and any HTML sequences/tags will be decoded.)
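For comparison, the general idea of folding 8-bit characters to their closest 7-bit equivalents can be sketched in Python. This is not how Texis does it internally; it is only an illustration using Unicode decomposition from the standard library, and the function name fold_diacritics is made up for the example:

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return text with combining diacritical marks removed."""
    # NFKD splits an accented letter into its base letter followed by
    # one or more combining marks (for example, é becomes e + U+0301).
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep everything else unchanged.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_diacritics("café résumé"))  # cafe resume
```

Letters that have no decomposition (such as ø or ß) pass through unchanged with this approach, so a real search index would likely need an extra mapping table for those.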