Increasing servers - Single Pages

Posted: Tue Dec 18, 2012 8:37 am
by kiddywood
Hi,

I'm crawling a large list of single URLs using either the single page option or a base URL with maximum depth set to 0.

To get maximum speed, what is a safe number I could increase the 'server' option to?

Posted: Tue Dec 18, 2012 10:15 am
by mark
Singles will be processed serially regardless. For multiple base URLs the best number will vary greatly depending on your system speed, memory, disk, network bandwidth, and somewhat depending on the file types. You probably wouldn't want to go over 20 servers though.
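
As a rough illustration of the trade-off described above, here is a minimal Python sketch of a bounded worker pool. It is not how the walker itself is implemented; `clamp_servers`, `walk`, and `fake_fetch` are hypothetical names, and the ceiling of 20 is simply the rule of thumb from this post, to be tuned against your own CPU, memory, disk, and bandwidth.

```python
from concurrent.futures import ThreadPoolExecutor
import time

MAX_SERVERS = 20  # rule-of-thumb ceiling from this thread, not a hard limit


def clamp_servers(requested, ceiling=MAX_SERVERS):
    """Keep the worker count between 1 and the ceiling."""
    return max(1, min(requested, ceiling))


def walk(base_urls, fetch, servers):
    """Fetch each base URL using at most `servers` concurrent workers."""
    workers = clamp_servers(servers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map() returns results in the same order as the input URLs
        return list(pool.map(fetch, base_urls))


def fake_fetch(url):
    """Hypothetical stand-in for a real page fetch: sleeps, then echoes the URL."""
    time.sleep(0.01)
    return url
```

Calling `walk(urls, fake_fetch, servers=50)` would run with 20 workers because of the clamp; with a single worker the same call degenerates to serial processing, which is the behaviour described for singles.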

Posted: Wed Dec 19, 2012 5:15 am
by kiddywood
Thanks Mark.

Also, regarding speed: when I crawl a large number of single pages (30,000), the walk gets progressively slower, to the point where it's only indexing a few pages an hour towards the end.

Can you advise please?

Posted: Wed Dec 19, 2012 7:51 am
by mark
Do you have current scripts? Version 6.1.5 is on the download page.

Posted: Wed Dec 19, 2012 9:00 am
by John
If you have a lot of Base URLs you may also want to turn off the "Follow Cross Site Links" setting.

Posted: Fri Dec 21, 2012 1:37 pm
by kiddywood
I've followed both of your suggestions above. The walk definitely sped up; however, the run failed after a couple of hours:

Walker stopping by request. (singles)
Dispatcher exiting.
Cancelled by user: 2012-12-17 21:35:07

I didn't cancel anything, so why did it stop with this message?

Posted: Fri Dec 21, 2012 1:56 pm
by mark
If you have old software, stopping any walk will stop them all.

Posted: Sat Dec 22, 2012 4:24 am
by kiddywood
Mark, this is the only walk I'm running.

It's stopping halfway through, as shown above.