Combining rewalk with URL list file

Thunderstone
Site Admin
Posts: 2504
Joined: Wed Jun 07, 2000 6:20 pm


Post by Thunderstone »



I have a very specific list of HTML pages to "webinate". :)
So I'm using a URL list. That list changes from time to
time so I'm wanting to use -rewalk when it changes:
gw -rewalk "&url.lst"

But the previous image of the URL list must be hanging out
there someplace, because only the pages listed in the original
version of my url.lst file are webinated. (I double-checked and
I'm positive I don't have an old version of the URL list file
someplace on the system, SGI-IRIX.) It's as if -rewalk causes
gw not to re-examine the URL list file...?


-John Koch - - - __o
Knowledge Systems, Inc. - - - - _ \<,_
<John.ksi@webplus.net> - - (_)/ (_)
(A NET-FRIENDLY SIG. http://www.ncsa.uiuc.edu/Edu/ICG/pt1.ch2.Etiquette.html )




Post by Thunderstone »



John.ksi@webplus.net said:

The rewalk option is designed to rewalk the database exactly the
way it was initially walked, including the URLs. If you want to
do a different walk you will need to start gw with a new database.
It might be better to see if there is a way you can specify the
URLs you want walked more generically, so that new files will
automatically be picked up and deleted files will be removed.



Post by Thunderstone »




By "exactly the way it was initially walked", does this include
all gw options originally specified (such as -n for plug-ins)?


I wish it were so. These HTML files are old newsletters which are
moved into a different directory on the web server when they're old
enough. None of these files are linked to from any other files,
including each other. So I use a URL list and the -D0 option. I'm not
thrilled at having to use a URL list file, but I'm sure glad the
functionality is there! I'm open to any ideas on how to accomplish
the above suggestion, tho'.

-John




Post by Thunderstone »



John.ksi@webplus.net said:

Yes, that includes all of the gw options originally specified.


What I would do in that case would be to use an HTML file as your URL
list, with an HREF for each newsletter, and point gw at that with a
-D1 option. I am assuming that you don't want to enable directory
indexing on that directory.
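
If the URL list already exists as a text file, the HTML file could be
generated from it rather than maintained by hand. Below is a rough sketch
in Python, assuming url.lst holds one URL per line. The function names and
the output file name newsletters.html are made up for illustration and are
not part of gw.

```python
import html
from pathlib import Path


def build_index(urls, title="Newsletter index"):
    """Return an HTML page containing one <a href> link per non-blank URL."""
    links = []
    for raw in urls:
        url = raw.strip()
        if not url:
            continue  # skip blank lines in the list
        safe = html.escape(url, quote=True)  # escape &, <, >, and quotes
        links.append(f'<li><a href="{safe}">{safe}</a></li>')
    return (
        "<html><head><title>" + html.escape(title) + "</title></head>\n"
        "<body>\n<ul>\n" + "\n".join(links) + "\n</ul>\n</body></html>\n"
    )


def write_index(list_path="url.lst", out_path="newsletters.html"):
    """Read the URL list (assumed one URL per line) and write the HTML page."""
    urls = Path(list_path).read_text().splitlines()
    Path(out_path).write_text(build_index(urls))
```

Calling write_index() whenever url.lst changes would regenerate the page.
The page would then need to be placed where the web server can serve it,
and gw pointed at its URL with -D1 as described above.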

