Walk Dies - related to Process Size

mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

That's the reason for the 15-minute schedule: the walk will resume by itself if it runs out of memory.
pete.smith
Posts: 73
Joined: Tue May 17, 2005 2:08 pm

Post by pete.smith »

I set it up to unlimited, but my todo is 200K+.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

If you set it to unlimited, you need to make sure your system doesn't run out of memory, or bad things could happen to the database. Also, above you said it died when using unlimited. What did the status say in that case? It won't quit unless you tell it to, the system runs out of RAM, or it thinks it's finished.

The main question is how much progress is it making on the total pages walked?

What are your Texis and Webinator script versions? (Get the latest from the website if you don't have them already.)
What are your non-default settings, including base urls?
pete.smith
Posts: 73
Joined: Tue May 17, 2005 2:08 pm

Post by pete.smith »

It doesn't really stop; it just crawls very slowly, with the todo never moving from 210,555. It really makes no progress.

Texis and Webinator:
Webinator 5.1.18-Unix-w/plugin

Latest version of dowalk script.

It does look like bad things happened to the database on unlimited:

Last complete walk: 2005-08-15 18:00:01
002 /rel/www/internal/internal_search/thunderstone5/morph/texis/scripts/webinator/dowalk(dispatch) 5117: can't open /rel/www/internal/internal_search/thunderstone5/morph/texis/InternalSearchProd/db1/: no SYSTABLES in the function ddopen
000 /rel/www/internal/internal_search/thunderstone5/morph/texis/scripts/webinator/dowalk(dispatch) 5117: Could not connect to /rel/www/internal/internal_search/thunderstone5/morph/texis/InternalSearchProd/db1 in the function openntexis (took seconds)
pete.smith
Posts: 73
Joined: Tue May 17, 2005 2:08 pm

Post by pete.smith »

Mark, also, this is Pete Smith; I attended the conference. As you saw, none of my settings were far from default: there is only one base URL and a few exclusions.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

I wasn't at the conference.

Try starting fresh with a new profile. Delete that one or make a copy of it to use. Set servers to 1, max process size to huge, the schedule to every 15 minutes, rewalk type to new, then start the walk.

Note that todo is not reflective of walk speed. It will vary up and down as new links are found and walked. Watch the total pages count for walk progress.
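
Since progress is judged from the total pages count rather than todo, a walk rate can be worked out from two readings of that count. The following is only a minimal sketch: `pages_per_minute` and its sample format are hypothetical names for illustration, the readings are assumed to be noted by hand from the walk status, and nothing here calls Webinator itself.

```python
from typing import Sequence, Tuple


def pages_per_minute(samples: Sequence[Tuple[float, int]]) -> float:
    """Average walk rate between the first and last sample.

    Each sample is (time_in_seconds, total_pages_walked). The values are
    assumed to be copied by hand from the walk status; this helper is
    hypothetical and does not query Webinator.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    elapsed_minutes = (t1 - t0) / 60.0
    if elapsed_minutes <= 0:
        raise ValueError("samples must be in increasing time order")
    return (p1 - p0) / elapsed_minutes


if __name__ == "__main__":
    # Two readings taken 15 minutes (900 seconds) apart.
    print(pages_per_minute([(0, 1000), (900, 1600)]))
```

For example, 600 new pages between two readings taken 15 minutes apart gives 40 pages per minute, whatever todo happens to show at the time.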
pete.smith
Posts: 73
Joined: Tue May 17, 2005 2:08 pm

Post by pete.smith »

I just did that. So if it is scheduled every 15 minutes, won't it overwrite the previous walk? Doesn't new zero it out? I guess in general my understanding of how this works isn't right.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Yes, normally a new walk would start clean. But all scheduled walks are refresh walks regardless of the type setting. Once the first new walk stops, though, it's less confusing to switch the type to refresh.
pete.smith
Posts: 73
Joined: Tue May 17, 2005 2:08 pm

Post by pete.smith »

So the downside of this is that I won't really get a fast turnaround like I did on my other server, which did 560K documents in 24 hours. What you are saying is that, due to memory ceilings, I will have to do smaller refresh walks and just wait until it does the whole thing. Can I increase the ceiling somehow at the system level? Why was my old Linux box seemingly so much better, even though this Solaris box is HUGE? Is there anything I can suggest to the admins?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

As far as memory usage goes, it would seem that the memory management libraries in Linux are better than those in Solaris, so it'll have to restart more often. But I don't know how much that's related to the speed problem. There's a little more CPU required for managing more memory, but I wouldn't expect it to be that much. You can edit the dowalk script to change the limit to whatever you want: change the 700000000 to something else. Make sure you don't drive the system into swapping, though; then performance would be guaranteed to be slow.
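
Before raising the limit, it may help to check a candidate value against the machine's physical RAM. The sketch below is only an illustration under stated assumptions: it treats the 700000000 figure as a byte count (the thread does not say what unit it is in), the 20% headroom is an arbitrary placeholder rather than a recommendation, and `limit_fits_in_ram` is a hypothetical helper, not part of dowalk or Texis.

```python
import os


def physical_ram_bytes() -> int:
    """Physical RAM via POSIX sysconf.

    Assumes the platform exposes SC_PAGE_SIZE and SC_PHYS_PAGES,
    as Linux and Solaris do.
    """
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def limit_fits_in_ram(limit_bytes: int,
                      ram_bytes: int,
                      other_usage_bytes: int = 0,
                      headroom: float = 0.2) -> bool:
    """Return True if a process-size limit leaves room to avoid swapping.

    headroom is the fraction of RAM kept free for the OS and caches
    (an assumed value); other_usage_bytes is memory already taken by
    other processes on the box.
    """
    usable = ram_bytes * (1.0 - headroom) - other_usage_bytes
    return limit_bytes <= usable


if __name__ == "__main__":
    # Hypothetical 2 GB machine checked against the existing limit.
    print(limit_fits_in_ram(700_000_000, 2_000_000_000))
```

On a real box, `physical_ram_bytes()` could supply the `ram_bytes` argument; actual swap activity during a walk is still the thing to watch.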

There are many factors besides cpu and memory.
Is the new server on the same network as the old one?
Are the routing and DNS on it the same?
What about disk I/O? Is it as fast or faster than the old system?

This thread hasn't really talked about walk rates, only stopping and resuming issues. What page/minute rates are you seeing on the two systems? What are the Texis versions (texis -version) on the two systems? What are the OS versions (uname -a)?