site map from webinator database? (how to do it)



Post by Thunderstone »




It is possible to generate a site map from a complicated join
between the HTML table and the REFS table.

Here's a quick way to see this.
cd into a Webinator database directory and type the following gw command
(remove the line feeds; they're just there for clarity):


gw -s "select dad.Depth,dad.Url,refs.Ref,dad.Title,son.Title
from html dad,refs,html son
where refs.Url=dad.Url
and son.Url=refs.Ref
and son.Depth>dad.Depth"


It'll produce a report looking like this:

Depth: 0
Url: www.thunderstone.com/
Ref: www.thunderstone.com/jump/Company.html
dad.Title: Thunderstone Home Page
son.Title: Thunderstone Background and Customers

Depth: 0
Url: www.thunderstone.com/
Ref: www.thunderstone.com/jump/Contact.html
dad.Title: Thunderstone Home Page
son.Title: Thunderstone Phone and Address Information
.
.
.
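
If you want to poke at the shape of that join without a Webinator database handy, here is a minimal sketch in Python using sqlite3. This is not Texis: the table contents and the example.com URLs are made up, and only the columns the query touches are modelled. It just shows how aliasing html twice (dad and son) pairs each page with the deeper pages it links to.

```python
import sqlite3

# Toy stand-ins for Webinator's html and refs tables (hypothetical data;
# only the columns used by the query are modelled).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE html (Url TEXT, Title TEXT, Depth INTEGER);
    CREATE TABLE refs (Url TEXT, Ref TEXT);
    INSERT INTO html VALUES ('example.com/', 'Home', 0);
    INSERT INTO html VALUES ('example.com/a.html', 'Page A', 1);
    INSERT INTO html VALUES ('example.com/b.html', 'Page B', 1);
    INSERT INTO refs VALUES ('example.com/', 'example.com/a.html');
    INSERT INTO refs VALUES ('example.com/', 'example.com/b.html');
    INSERT INTO refs VALUES ('example.com/a.html', 'example.com/');
""")

# Same join as the gw command: html appears twice, once as the parent
# (dad) and once as the child (son), linked through refs.
rows = conn.execute("""
    SELECT dad.Depth, dad.Url, refs.Ref, dad.Title, son.Title
    FROM html dad, refs, html son
    WHERE refs.Url = dad.Url
      AND son.Url = refs.Ref
      AND son.Depth > dad.Depth
    ORDER BY refs.Ref
""").fetchall()

for row in rows:
    print(row)
```

The link from Page A back to the home page is dropped by the son.Depth > dad.Depth test, so only downward links show up.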

Turning this query into a little Texis Web Script allows you to create an
HTML table of contents. The output of the script below
can be seen at http://www.thunderstone.com/webinator/toc.html

Here's the script: (a discussion follows the script)
--------------------------------------------------------------------------
<script language=vortex>
<db=/htdocs/webinator/db>
<timeout=-1></timeout>

<a name=main>
<title>Table of Contents</title>
<$Depth=0>
<$Url="www.thunderstone.com/">
<toc>
</a>

<a name=toc>
<ul>
<sql ROW "select son.Title Title,son.Url Url,son.Depth Depth
from html dad,refs,html son
where dad.Url=$Url
and refs.Url=$Url
and son.Url=refs.Ref
and son.Depth>$Depth
">
<xtree search $Url>
<if $ret eq "">
<xtree insert $Url>
<li><a href=http://$Url>$Title</a>
<toc>
</if>
</sql>
</ul>
</a>
</script>
------------------------------------------------------------------------------


Basically we recursively descend the html tree, using the refs table
to tell us where to go next.

The html table is aliased twice (as dad and son) so that it can be
joined to itself through refs:

Parent data in html -> child_links in refs -> child data in html


The <xtree> calls are there to prevent any URL from being shown twice.
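
For anyone who finds the Vortex recursion hard to follow, here is a rough Python sketch of the same idea. It is not the script above translated line for line: the html and refs dictionaries are hypothetical in-memory stand-ins for the tables, and a plain set plays the role that the <xtree> calls play in the script.

```python
# Hypothetical in-memory stand-ins for the html and refs tables.
html = {
    "example.com/": ("Home", 0),
    "example.com/a.html": ("Page A", 1),
    "example.com/b.html": ("Page B", 1),
    "example.com/c.html": ("Page C", 2),
}
refs = {
    "example.com/": ["example.com/a.html", "example.com/b.html"],
    "example.com/a.html": ["example.com/c.html", "example.com/"],
    "example.com/b.html": ["example.com/c.html"],
}


def toc(url, depth, seen, indent=0):
    """Return nested <ul>/<li> lines for pages reachable from url."""
    lines = [" " * indent + "<ul>"]
    for child in refs.get(url, []):
        if child not in html:
            continue  # link target was never fetched
        title, child_depth = html[child]
        # Only follow links that go deeper, and list each URL once.
        if child_depth <= depth or child in seen:
            continue
        seen.add(child)
        lines.append(" " * indent + f'<li><a href="http://{child}">{title}</a>')
        lines.extend(toc(child, child_depth, seen, indent + 2))
    lines.append(" " * indent + "</ul>")
    return lines


out = toc("example.com/", 0, set())
print("\n".join(out))
```

Page C is linked from both Page A and Page B, but it is listed only under the first one reached, which is the job the seen set (and <xtree> in the script) is doing.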

With a little putzing this script could be turned into a pretty useful tool.
To run it at your site, just edit the <db> name and the base Url in <main>.

Then type: ~cgi-bin/texis myscript >mytoc.html


Good Luck,

Bart

Oh yeah: don't forget to run "gw -index" before you play with this script.

