Question on traversing URL links

Thunderstone
Site Admin
Posts: 2504
Joined: Wed Jun 07, 2000 6:20 pm

Post by Thunderstone »



Hello all!

Can anyone tell me exactly how Webinator traverses URL links from a
starting HTML file? My problem is that my entire site is not being
indexed properly. Let me detail what I am talking about by explaining
how my links are set up:

index.html
|
| has a link pointing to
| ossmap2.htm (same directory)
v
ossmap2.htm
|
| has a link pointing to
| news_body.htm (same directory)
v
news_body.htm
|
| has a link pointing to
| news_toc.htm (same directory)
v
news_toc.htm
|
| has a link pointing to
| /UX-SE/News/News2/maillist.html
| that is 3 subdirectories below
| the web server root directory.
| The link is part of an
| unordered list.
v
/UX-SE/News/News2/maillist.html
|
| has numerous links which point
| to about 170 HTML files within
| the local /UX-SE/News/News2/
| subdirectory. The syntax is:
| HREF="msg00163.html". However,
| none of these HTML files are
| indexed by Webinator.
v
msg00163.html


Doesn't Webinator traverse links in this manner? My entire web site
is accessible from the root index.html file by clicking on links to the
files I want to access. It seems to me that Webinator should be able to
follow these links. If I am mistaken or have things set up wrong, could
someone please let me know? Here is the Webinator syntax I use to generate
my database from a batch file:


rem OSSWEB2.BAT
rem Start creation of webinator database.
cd d:\netscape\server\docs\webinator\bin
gw -wipe -dD:\NETSCAPE\SERVER\DOCS\WEBINATOR\DB
gw -v4 -b -z300000 -dD:\NETSCAPE\SERVER\DOCS\WEBINATOR\DB -lOSSWEB.LOG "&url.lst"
gw -index -dDB -lOSSWEB.LOG


Here are the contents of my "url.lst" file:
http://oss.az.stratus.com/
http://oss.az.stratus.com/antares/
http://oss.az.stratus.com/antares/www/


Any feedback would be appreciated. Thanks!

Brett Lackey
Brett_Lackey@oss.az.stratus.com
Post by Thunderstone »




Webinator should walk every page on a single host that is linked from
that host, as you describe. Any directory references should end in a
slash (/). You have verbosity set to 4. If you turn it up to 8 (-v8), gw
will print every link it sees, whether it wants to walk it, and why.
That should help you identify your problem.
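
To illustrate why the trailing slash matters, here is a minimal sketch of standard relative-URL resolution using Python's urllib.parse.urljoin. It is not Webinator's own code, and how gw resolves links internally is an assumption here. The base URLs are built from the host and paths in the question, and the helper name resolve is made up for this example.

```python
from urllib.parse import urljoin


def resolve(base, href):
    """Resolve a relative HREF against the URL of the page that contains it."""
    return urljoin(base, href)


# Hypothetical base URLs: the same directory with and without a trailing slash.
with_slash = "http://oss.az.stratus.com/UX-SE/News/News2/"
without_slash = "http://oss.az.stratus.com/UX-SE/News/News2"

# The relative link syntax quoted in the question.
link = "msg00163.html"

# With the slash, the link stays inside the News2 directory.
print(resolve(with_slash, link))

# Without the slash, "News2" is treated as a file name and is replaced,
# so the link resolves one directory too high.
print(resolve(without_slash, link))
```

If a directory is referenced without the slash somewhere along the chain, relative links such as HREF="msg00163.html" can resolve to pages that do not exist, which is one possible reason for them not being indexed. The -v8 output should show the exact URLs gw is trying.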

I can't reach the above URL to see what your HTML looks like.