Simultaneous walks

scott.shaver
Posts: 45
Joined: Tue May 31, 2005 12:13 pm

Post by scott.shaver »

Is there a recommended limit on the number of walks that can run at the same time? I ran 19 walks simultaneously against the same web site, each using a different user. The first few walks seemed to work, but the rest stopped after finding only a few pages when they should have found around 3,000. Some got 40, some got 300.

If I rerun them one at a time, they work (with rewalk set to new, not refresh).

The walks that failed log a lot of messages like this:

The link : https://partners.mcdata.com/prm/partner ... urce=index
Is a duplicate of: https://partners.mcdata.com/prm/dispatcher/viewed_legal
Referenced by : https://partners.mcdata.com/prm/partner/index.jsp
https://partners.mcdata.com/prm/dispatcher/viewed_legal
https://partners.mcdata.com/prm/partner ... urce=index
https://partners.mcdata.com/prm/partner ... urce=index
https://partners.mcdata.com/prm/partner ... objectname
https://partners.mcdata.com/prm/partner ... s_list.jsp
https://partners.mcdata.com/prm/partner ... ner_promos
https://partners.mcdata.com/prm/partner ... ner_promos
https://partners.mcdata.com/prm/partner ... urce=index

The link : https://partners.mcdata.com/prm/getProt ... _flyer.pdf
Is a duplicate of: https://partners.mcdata.com/downloads/m ... _flyer.pdf
Referenced by : https://partners.mcdata.com/prm/dispatcher/viewed_legal
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

For simultaneous walks, "Max process size" should be set no larger than "Large" so that the system doesn't run out of memory. If a lot of walks are running, you might want "Medium" or "Small". Otherwise it should work OK as long as the walked server tolerates it. Some servers begin blocking crawlers that they deem abusive; that may be what happened to you.
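
If the walked server is the bottleneck, one workaround is to start the walks a few at a time instead of all 19 at once. Below is a rough sketch in Python of that idea. It is not a built-in feature of the walker: `start_walk` is a hypothetical stand-in for however you launch a single walk and wait for it to finish, and the default limit of 4 is an arbitrary guess to tune against what the server tolerates.

```python
from concurrent.futures import ThreadPoolExecutor


def run_walks_limited(walk_names, start_walk, max_parallel=4):
    """Call start_walk(name) for every name, at most max_parallel at a time.

    start_walk is a hypothetical callable supplied by the caller; it should
    block until that one walk is done and return whatever summary you want
    (for example, the number of pages found).

    Returns a dict mapping each walk name to start_walk's return value.
    """
    names = list(walk_names)  # materialize so we can iterate twice
    # The pool size is the concurrency cap: no more than max_parallel
    # calls to start_walk are ever in flight together.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = pool.map(start_walk, names)  # results come back in input order
        return dict(zip(names, results))
```

Called with 19 user names and `max_parallel=3`, this would keep at most three walks hitting the site at any moment, with the next one starting as soon as a slot frees up.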