Crawling Speed

mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

The status only shows the recent state. There's no log of statuses.
rmehrotra
Posts: 17
Joined: Thu Jul 28, 2005 3:12 pm

Post by rmehrotra »

Mark, the status page shows the following figures. Most of the time #1 and #4 stay the same, while #2 and #3 keep changing. Could you explain the significance of each of them?
1- 3,564 pages in todo
2- 10,186 pages scheduled for the next hour
3- 1,265 pages visited in the last hour
4- 42,962 pages total

Also, in the last 5 hours only about 1,300 pages were indexed, judging by the change in value #4. That seems too slow given that there are about 500K pages to crawl. Are there any standard ways to speed up the process? On the web, each page normally takes about 3-5 seconds to load.
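
For a rough sense of scale, the figures quoted above can be turned into a back-of-the-envelope estimate. This is only a sketch: the function names are made up for illustration, the crawl rate is assumed to stay constant, and `parallel_fetchers` is an assumed parameter, not a setting taken from the product.

```python
def hours_to_crawl(total_pages: int, pages_per_hour: float) -> float:
    """Hours needed to fetch total_pages at a steady pages_per_hour rate."""
    if pages_per_hour <= 0:
        raise ValueError("pages_per_hour must be positive")
    return total_pages / pages_per_hour


def max_pages_per_hour(seconds_per_page: float, parallel_fetchers: int = 1) -> float:
    """Upper bound on throughput if every fetch takes seconds_per_page.

    parallel_fetchers is a hypothetical knob for illustration only.
    """
    if seconds_per_page <= 0:
        raise ValueError("seconds_per_page must be positive")
    return parallel_fetchers * 3600 / seconds_per_page


# Figures from the post: about 1,300 pages in 5 hours, about 500K pages in total.
observed_rate = 1300 / 5
estimate = hours_to_crawl(500_000, observed_rate)
print(f"observed rate: {observed_rate:.0f} pages/hour")
print(f"estimated crawl time: {estimate:.0f} hours ({estimate / 24:.0f} days)")

# Ceiling for one fetcher when each page takes 3 to 5 seconds to load.
print(f"single-fetcher ceiling: {max_pages_per_hour(5):.0f} to {max_pages_per_hour(3):.0f} pages/hour")
```

At the observed rate the full crawl would take on the order of months, while a single fetcher loading pages in 3-5 seconds tops out somewhere between 720 and 1,200 pages per hour under these assumptions.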

Thanks a lot for patiently resolving each of my doubts. Let me know if there is another way to clear up these doubts, such as by phone or through a support center.

-rm
rmehrotra
Posts: 17
Joined: Thu Jul 28, 2005 3:12 pm

Post by rmehrotra »

Mark,
The following is the log from when the process stops automatically:
------------------------------------------------------
Walk started at 2005-08-11 15:05:05 (by schedule)
JavaScript walking disabled
HTTPS walking disabled
Start fetching at
Ignore urls containing any of the following:
/cgi-bin/
~
started 1 (14138) Resume 42fb42783
started 2 (14139) Resume 42fb6a235
Process memory limit exceeded (current: 50,053,120, limit: 50,000,000)
348 pages fetched (26,910,421 bytes) from
Process memory limit exceeded (current: 52,813,824, limit: 50,000,000)
1181 pages fetched (28,642,594 bytes) from
44177 pages fetched (1,703,625,984 bytes) Total
6141 errors Total
421 duplicate pages Total

Removing commonality from fetched pages...
Updating search index ...Done.
Creating spell-checker dictionaries...Done.
Done.
Verifying usability of new walk.

Walk finished at 2005-08-11 16:27:28 (took 1 hours 2 minutes 36 seconds)
Keeping database live: /usr/local/morph3/texis/default.42cd490f3/db2
------------------------------------------------------
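
The "Process memory limit exceeded" lines in the log above show how far each process went past the limit. A minimal sketch for pulling those numbers out of a saved copy of the log follows; the regular expression is written against the line format shown here, and the function name is hypothetical, not part of the product.

```python
import re

# Matches lines such as:
# Process memory limit exceeded (current: 50,053,120, limit: 50,000,000)
LIMIT_RE = re.compile(
    r"Process memory limit exceeded \(current: ([\d,]+), limit: ([\d,]+)\)"
)


def memory_overruns(log_text: str) -> list[tuple[int, int, int]]:
    """Return (current, limit, overrun) in bytes for each limit-exceeded line."""
    results = []
    for match in LIMIT_RE.finditer(log_text):
        current = int(match.group(1).replace(",", ""))
        limit = int(match.group(2).replace(",", ""))
        results.append((current, limit, current - limit))
    return results


sample = """\
Process memory limit exceeded (current: 50,053,120, limit: 50,000,000)
348 pages fetched (26,910,421 bytes) from
Process memory limit exceeded (current: 52,813,824, limit: 50,000,000)
"""

for current, limit, over in memory_overruns(sample):
    print(f"{current:,} bytes used, {over:,} over the {limit:,} limit")
```

Both processes stopped only slightly above the 50,000,000 limit, which is consistent with each one being halted as soon as it crossed that threshold.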

Is there any way we can increase the limit?

-rm
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Yes, go to all walk settings. Near the bottom you'll find maximum process size. Increase it.