Webinator just...stops

Posted: Thu Dec 04, 2003 11:54 am
by b.sims
I can view the children, and page2 is listed there but not as a hyperlink; no reason is given, even though verbosity is 5. The walk appears to have completed normally to depth 4.

The only things in the Vortex log around that time are:

000 Dec 4 12:54:03 D:\Web-Root\webinator\dowalk:2108: Vortex (2052) ABEND: exception 0xC0000005 (ACCESS_VIOLATION)
100 Dec 4 12:54:03 D:\Web-Root\webinator\dowalk:2108: Max page size exceeded (truncated) for http://ioc.unesco.org/igospartners/IGOS ... July02.pdf

Neither of these seems to explain the problem.
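
For anyone scanning a larger Vortex log for the same symptom, here is a minimal Python sketch that picks out ABEND entries and the process ID they report. The field layout (three-digit code, timestamp, script path, script line, message) is inferred only from the two log lines quoted above and is an assumption, not a documented format; `find_abends` is a hypothetical helper name.

```python
import re

# Layout assumed from the two sample lines in this thread:
# <code> <Mon D HH:MM:SS> <script path>:<script line>: <message>
LOG_LINE = re.compile(
    r"^(?P<code>\d{3}) "
    r"(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<script>.+?):(?P<line>\d+): "
    r"(?P<message>.*)$"
)
# Matches messages such as "Vortex (2052) ABEND: exception ..."
ABEND = re.compile(r"Vortex \((?P<pid>\d+)\) ABEND: (?P<detail>.*)")


def find_abends(lines):
    """Return one dict per ABEND entry found in the given log lines."""
    hits = []
    for raw in lines:
        entry = LOG_LINE.match(raw.strip())
        if entry is None:
            continue  # line does not follow the assumed layout
        abend = ABEND.search(entry.group("message"))
        if abend:
            hits.append({
                "pid": int(abend.group("pid")),
                "script_line": int(entry.group("line")),
                "detail": abend.group("detail"),
            })
    return hits


sample = [
    r"000 Dec 4 12:54:03 D:\Web-Root\webinator\dowalk:2108: Vortex (2052) ABEND: exception 0xC0000005 (ACCESS_VIOLATION)",
    r"100 Dec 4 12:54:03 D:\Web-Root\webinator\dowalk:2108: Max page size exceeded (truncated) for http://ioc.unesco.org/igospartners/IGOS ... July02.pdf",
]

for hit in find_abends(sample):
    print(hit)
```

The process ID it reports can then be compared with the one shown in the walk status.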

Posted: Thu Dec 04, 2003 12:56 pm
by mark
Actually, if 2052 is the process reported in the walk status (e.g. "started # (2052) on http:...") then that *is* the problem. What is line 2108 in your dowalk script? If it's a <fetch>, it may have found a page it can't deal with. What's your Texis version (texis -version)?

Posted: Thu Dec 04, 2003 2:36 pm
by b.sims
I was hoping you might say that...

2108 is indeed a fetch:

<fetch parallel=$SSc_maxthreads $u><!-- get everything at this depth -->

Do you think a single page is causing the walk to stop for all other pages?

Version is:

Commercial Version 4.03.1054748145 of Jun 4, 2003 (i686-intel-winnt-64-32)

Posted: Thu Dec 04, 2003 3:00 pm
by mark
That ABEND indicates an internal program error, at which point that Vortex process bails out; errors of that severity are fairly unusual. So whatever site that sub-process was working on will likely be left unfinished.

In your dataspace for that profile you should find cururls.2052, which indicates what URLs were being worked on by that process. Please open a tech support ticket using the Tech Support link on the left menu here and provide that file. Also include anything else that might be needed to replicate the problem, such as non-default settings and the Webinator scripts version.

If you set threads to 1 in the profile, then run that walk from a command prompt with ttyverbose=1, you can see the last URL processed before it bailed out. Then look in the cururls file to find the one after that. See http://www.thunderstone.com/texis/site/ ... ing+dowalk
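
The "find the one after that" step could be sketched in Python roughly as follows. This assumes the cururls file is plain text with one URL per line in processing order, which this thread does not confirm; `url_after` and `suspect_from_file` are hypothetical helper names, and the example.com URLs are placeholders.

```python
def url_after(last_seen, cururls_lines):
    """Return the URL listed immediately after last_seen, or None.

    Assumes one URL per line, in the order the process took them.
    """
    urls = [line.strip() for line in cururls_lines if line.strip()]
    try:
        position = urls.index(last_seen)
    except ValueError:
        return None  # last_seen is not listed in the file at all
    if position + 1 < len(urls):
        return urls[position + 1]
    return None  # last_seen was the final entry


def suspect_from_file(path, last_seen):
    """Same lookup, reading the lines from a file such as cururls.2052."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return url_after(last_seen, handle)


# Placeholder data standing in for the real cururls contents.
example_lines = [
    "http://example.com/a.html\n",
    "http://example.com/b.pdf\n",
    "http://example.com/c.html\n",
]
print(url_after("http://example.com/a.html", example_lines))
```

If the file turns out to use a different layout, only the line-splitting step would need to change.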

Posted: Thu Dec 04, 2003 3:13 pm
by b.sims
I just excluded the directory containing the file mentioned above, and the walk runs through to the expected level. So, that seems to confirm that certain files are causing the walk process to die.

I will do as you suggest to find out why that particular file is upsetting the walk, and until then run excluding that directory.

Thanks for your help.