indexing large documents

Thunderstone
Site Admin
Posts: 2504
Joined: Wed Jun 07, 2000 6:20 pm


Post by Thunderstone »




Can Webinator index large (a hundred pages or so) HTML documents so that it
returns sections of the document in the results?

Our problem is that we have somewhat large documents (>100 pages), and would
like the search engine to return links to all of the matches within a single
document. As it is now, it returns a link only to the entire document. Is
there a way around this?

Thanks in advance

Eddie Weigle





Post by Thunderstone »



In the script's "context" function, you can use mminfo to display a sample of
text around each hit instead of the whole document.

Replace:

<sql max=1 "select * from html where id = $id"></sql>

with something like:

<apicp alwithin 1>
<apicp alintersects 1>
<sum "%s" $query " @0 w/100">
<sql max=1 "select Url,Depth,Visited,Title,Meta,mminfo($ret,Body,0,0,1) Body from html where id = $id"></sql>

See the manual for details on mminfo and within processing (w/).
Before displaying Body, you might want to replace the hit marker lines in it
with separators, something like this:

<$s="^301 End of Metamorph hit$" "300 <Data from Texis>=[^\x0a]+">
<$r="\x0a\x0a" "\x0a\x0a --- \x0a\x0a">
<sandr $s $r $Body>
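
For anyone who wants to see what those two substitutions do without running Vortex, here is a rough Python sketch of the same idea. It is only an illustration, not part of the Webinator script. The sample text and the clean_hits name are made up, and the exact layout of the marker lines is an assumption taken from the patterns above, so check real mminfo output before relying on it.

```python
import re

# Hypothetical mminfo-style text: each hit is assumed to start with a
# "300 <Data from Texis> ..." line and end with "301 End of Metamorph hit".
sample = (
    "300 <Data from Texis> 120 45\n"
    "first snippet of text around a hit\n"
    "301 End of Metamorph hit\n"
    "300 <Data from Texis> 980 52\n"
    "second snippet of text around a hit\n"
    "301 End of Metamorph hit\n"
)


def clean_hits(body: str) -> str:
    """Mirror the two search-and-replace pairs from the post."""
    # Pair 1: the end-of-hit marker line becomes a blank-line gap.
    body = re.sub(r"^301 End of Metamorph hit$", "\n\n", body, flags=re.MULTILINE)
    # Pair 2: the start-of-hit marker, through the end of its line,
    # becomes a " --- " separator surrounded by blank lines.
    body = re.sub(r"300 <Data from Texis>[^\n]+", "\n\n --- \n\n", body)
    return body


cleaned = clean_hits(sample)
print(cleaned)
```

What is left is just the snippets, each preceded by a " --- " separator, which is roughly what you would then display in the context page.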



