How to limit number of "children" links for each URL

edev
Posts: 127
Joined: Wed Sep 14, 2005 5:10 pm

Post by edev »

Hi Mark,

I followed your post above and, starting at line 3323 of dowalk 5.1.54, added this code:

<if $SSc_maxpages ge 0><!-- there is a page count limit -->
<if $mynpages gte 100>
Site max pages of 100 reached.
<sql "update counts set npages=npages-1,nbytes=nbytes-$pagesize">
</sql>
<$noMoreNew=1>
<return>
</if>
<if $npages gt $SSc_maxpages><!-- hit page count limit -->
Maxpages of $SSc_maxpages reached.
<sql "update counts set npages=npages-1,nbytes=nbytes-$pagesize">
</sql>
<$noMoreNew=1>
<return>
</if>
</if>

But when I run the walk and look at the log, it prints the "Site max pages of 100 reached." message and then keeps going. For testing I have only one base URL:

2006-12-28 10:53:31 started 1 new (1120) on http://www.collectionscanada.ca
Site max pages of 100 reached.
100 pages fetched (2,334,935 bytes) from http://www.collectionscanada.ca
2006-12-28 10:54:00 started 1 (5668) Resume 4593e87b18
Site max pages of 100 reached.
100 pages fetched (942,946 bytes) from http://www.collectionscanada.ca

The walk does not stop at 100 pages for one single URL. Is there anywhere else I should put this code?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Add
<xtree flush todo>
after each of the <$noMoreNew=1> lines so the walk doesn't save resume state.
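
For reference, the block from the first post with that line added in both branches might look something like the sketch below. It is untested and assumes the flush goes directly after each <$noMoreNew=1> and before the <return>; nothing else is changed from the original code. The "Resume 4593e87b18" line in the log suggests the second batch of 100 pages came from saved resume state, which is what the added line is meant to prevent.

```
<if $SSc_maxpages ge 0><!-- there is a page count limit -->
<if $mynpages gte 100>
Site max pages of 100 reached.
<sql "update counts set npages=npages-1,nbytes=nbytes-$pagesize">
</sql>
<$noMoreNew=1>
<xtree flush todo><!-- added per the reply above, so no resume state is saved -->
<return>
</if>
<if $npages gt $SSc_maxpages><!-- hit page count limit -->
Maxpages of $SSc_maxpages reached.
<sql "update counts set npages=npages-1,nbytes=nbytes-$pagesize">
</sql>
<$noMoreNew=1>
<xtree flush todo><!-- added per the reply above, so no resume state is saved -->
<return>
</if>
</if>
```

After re-running the walk, the log should be checked for a second "started ... Resume ..." entry for the same base URL; if one still appears, the limit is being bypassed somewhere else.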
edev
Posts: 127
Joined: Wed Sep 14, 2005 5:10 pm

Post by edev »

Thank you, Mark. I will try it and let you know!