
Fixing abstract

Posted: Fri Sep 17, 1999 2:26 pm
by Thunderstone



We have the problem that some of our search descriptions begin with product
names that look like this: "GIFmation". So, in our search results, we want
something that looks like this:

1. Some Page About GIFmation
GIFmation is the only professional quality blah blah ...
http://blahblahblah.

The problem is that what <abstract ...> actually returns is "Fmation is the
only professional". It ignores the two leading capital letters. As a result,
in our search results, we actually end up displaying something like this:

1. Some Page About GIFmation
Fmation is the only professional quality blah blah ...
http://blahblahblah.

Ideally, <abstract> would just not ignore the two leading capital letters.
Since I don't have any way of fixing that, I've written an alternate
function, <vabstract>.

<!--
vabstract
Allow an abstract to begin with a sequence of capital letters.
The normal abstract sees a paragraph that begins "GIFmation is..."
and turns it into "Fmation is..." We need to check for that case
and fix it.

created: ads, 16 Sep 1999
-->

<a name=vabstract text>
<local original expression>

<!-- get the vortex abstract -->
<abstract $text 180>
<$original = $ret>

<!--
compose a regular expression that consists of the original
abstract, with all characters made literal, immediately preceded
by any number of capital letters.
-->

<!-- escape all special characters and prepare the search expression
-->

<sandr
"[\X21\X24\X2A\X2B\X2C\X2D\X2E\X3D\X3E\X5B\X5C\X5D\X5E\X7B\X7D]="
"\X5C\X5C\1" $original>
<sum "%s" "[A-Z]*>>" $ret "=">
<$expression = $ret>

<!-- find the extended abstract, including leading capital letters.
-->

<rex $expression $Body>
<return $ret>

</local>
</a>

This function doesn't seem to work. It is apparently correctly constructing
$expression, because I can <send $expression>, copy it from the source in
the browser, paste it into the rex function, and it works fine. But for some
reason when I pass it as a variable, it ceases to work.
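
To make the intent of <vabstract> concrete, here is a minimal sketch of the same logic in Python rather than Vortex. It uses Python's re module, not REX, and the function name extended_abstract is hypothetical; it only illustrates the idea of escaping the abstract, prefixing "[A-Z]*", and searching the body.

```python
import re

def extended_abstract(body: str, abstract: str) -> str:
    """Return the abstract plus any capital letters directly before it in body."""
    # re.escape makes every character of the abstract literal,
    # playing the role of the <sandr> escaping step.
    pattern = "[A-Z]*" + re.escape(abstract)
    match = re.search(pattern, body)
    # Fall back to the unmodified abstract if it cannot be located.
    return match.group(0) if match else abstract

body = "GIFmation is the only professional quality blah blah"
print(extended_abstract(body, "Fmation is the only professional"))
```

The fallback to the original abstract is a design choice of this sketch, so that a failed match never leaves the result empty.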

My first question: why doesn't it work?

My second question is this: I've considered an alternate solution, where I
actually delimit the descriptive text with some delimiter in the html, and
pull it out using <rex> instead of <abstract>, when delivering the search
results. Unfortunately, the obvious delimiter is an html comment, which is
not stored in the database $Body field. Is there some way of storing a
delimiter in the $Body field so I can use it to pull out the descriptive
text? (Without having the delimiters visible to human readers.)

Thanks for any input.

Aaron




Fixing abstract

Posted: Fri Sep 17, 1999 5:59 pm
by Thunderstone




You only need one \X5C in your replacement string, not two.
<sandr
"[\X21\X24\X2A\X2B\X2C\X2D\X2E\X3D\X3E\X5B\X5C\X5D\X5E\X7B\X7D]="
"\X5C\1" $original>
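
The effect of one versus two backslashes can be sketched in Python as an analogy. This uses Python's re module, not REX or <sandr>, and assumes REX likewise reads a doubled backslash as a literal backslash; the helper name escape is hypothetical.

```python
import re

# Same characters as the hex list in the <sandr> search expression.
SPECIALS = r"[!$*+,\-.=>\[\\\]^{}]"

def escape(text: str, backslashes: int) -> str:
    """Prefix each special character with the given number of backslashes."""
    prefix = "\\" * backslashes
    return re.sub(SPECIALS, lambda m: prefix + m.group(0), text)

original = "Fmation costs $5.00"
body = "GIFmation costs $5.00 today"

single = "[A-Z]*" + escape(original, 1)   # each special character becomes literal
double = "[A-Z]*" + escape(original, 2)   # demands a literal backslash in the body

print(re.search(single, body))
print(re.search(double, body))
```

With one backslash the special character itself is escaped. With two, the pair is read as a literal backslash and the following character keeps its special meaning, so the pattern no longer matches the body text.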


As for your second question, another alternative would be to use META tags
for the descriptive text, and then select the Meta field into $Meta when you
do the search.

John Turnbull
-------------
Thunderstone Software