We are using the Free Webinator 2.1 to index sites, with the -p40
parameter to limit indexing to 40 pages in cases where a site may have
too many links. (We are also using time limits, and the -j parameter to
try to keep the walk within one directory.)
I watch the gw walker run, and sure enough it stops processing pages at 40.
HOWEVER, all the while it is adding huge numbers of pages to the TODO
list. This slows down processing, which I don't really care about.
The real problem is this: When the fetching is done, the next command
indexes the database. At this point, Webinator goes back into the TODO
list and starts fetching the hundreds or thousands of pages listed there.
Is gw supposed to be smart enough to know that 40 pages means 40 pages, not
"40 pages now, and everything else you can see in a few seconds"?
I don't see any parameter I can use (including -a) that will tell gw to
limit what it puts in the TODO list to x pages, OR a parm that tells it to
empty its TODO list.
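To make the behavior concrete, here is a toy Python model of what I think
I am seeing. It is only my mental model, not gw's actual code: the function
name crawl, the cap_todo switch, and the made-up site where every page
links to ten new pages are all assumptions for illustration.

```python
from collections import deque

def crawl(start, links, max_pages, cap_todo):
    """Toy model of a page-capped crawl (hypothetical, not gw's logic).

    links maps each page to the pages it links to.
    Returns (pages fetched, pages still waiting on the to-do list).
    """
    todo = deque([start])
    seen = {start}
    fetched = []
    while todo and len(fetched) < max_pages:
        page = todo.popleft()
        fetched.append(page)
        for target in links.get(page, []):
            if target in seen:
                continue
            # With cap_todo set, stop queueing once fetched plus queued
            # pages already reach the page limit.
            if cap_todo and len(fetched) + len(todo) >= max_pages:
                break
            seen.add(target)
            todo.append(target)
    return fetched, list(todo)

# Made-up site: page i links to pages 10*i+1 .. 10*i+10.
site = {i: list(range(10 * i + 1, 10 * i + 11)) for i in range(1000)}

done, left = crawl(0, site, 40, cap_todo=False)
print(len(done), len(left))   # 40 361

done, left = crawl(0, site, 40, cap_todo=True)
print(len(done), len(left))   # 40 0
```

The first call is what I see now: 40 pages fetched, but hundreds left
queued for a later step to pick up. The second call is what I was hoping
the page limit would mean.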
What gives?
Scott Cochran
------- Distributed Internet Applications --------
Online Development www.ondev.com
13555 Automobile Blvd #350
Clearwater, FL 34622 (813) 556-0120