Turn auto-HTML escaping OFF?

Crash.Alpha
Posts: 9
Joined: Sun Sep 23, 2001 1:26 pm

Post by Crash.Alpha »

NEED HELP QUICK!

We are walking our site with a modified version of the demo dowalk script. All page indexing is done on content that first runs through a special "accent stripping" proc we wrote. We want to save the original HTML for the title and a "description" meta tag in separate fields for display purposes. This lets us do accent-stripped searches while still displaying the title and description with the original accents intact.

The dowalk logic keeps the raw HTML from the "description" meta tag, but "urlinfo title" does HTML unescaping. This is causing problems that I assume come from the difference between the code pages used by the server doing the indexing and the PC displaying the results.

In fact, in our multi-lingual environment (Swiss - imagine the character sets we have to support between English, French, German and Italian) HTML auto-unescaping has caused all kinds of grief. Can we:

1) turn it OFF all the time?

2) turn it OFF for just urlinfo, urltext, etc?

I know everyone needs an answer ASAP, but... well, QUICK help may put you on my xmas gift list... like, I could post my accent-stripping proc for anyone who needs it...

Carlo
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

You're probably seeing the escaping on display, not on extraction or storage. To send HTML literally, use <send $yourvariable> or <fmt "%s" $yourvariable>.
Crash.Alpha
Posts: 9
Joined: Sun Sep 23, 2001 1:26 pm

Post by Crash.Alpha »

Using your example, I tried this:

<urlinfo title>
<fmt "%s" $ret>

The result on the output was the HTML-unescaped text. Is texis.exe unescaping the output? It is unclear to me what would be stored in a var called $title if I did this:

<urlinfo title>
<strfmt "%s" $ret><$title = $ret>

According to your example, it should be the original HTML, but it appears to be the unescaped HTML. I am looking at the results with tsql: it shows unescaped characters in the Title field, but the Description field (which uses the meta tag extraction code) APPEARS to store HTML-escaped characters.

It really looks like urlinfo is doing the unescaping for me.
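
One way to tell which form a stored field holds is to check whether unescaping changes it. The sketch below only illustrates that check with Python's standard `html` module. It is not Texis or Vortex code, and the helper name `looks_escaped` and the sample title are hypothetical.

```python
import html


def looks_escaped(text: str) -> bool:
    """Return True if unescaping changes the text, i.e. it still contains HTML entities."""
    return html.unescape(text) != text


# Hypothetical title as it might appear in the page source.
raw_title = "Caf&eacute; Z&uuml;rich"

# What an unescaping step would turn that title into.
shown_title = html.unescape(raw_title)

print(looks_escaped(raw_title))    # entities still present
print(looks_escaped(shown_title))  # already unescaped
```

If the Title field behaves like `shown_title` and the Description field behaves like `raw_title`, that would be consistent with the title being unescaped somewhere before storage rather than only at display time.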