Walk stops early

Posted: Tue Feb 10, 2009 3:01 pm
by kiddywood
Hi,

I'm trying to walk a list of around 18000 URLs using 'single page', but the walk stops at around 2120. I have checked for any URLs that could be causing the walk to stop, but there is nothing obvious. It just stops and says it has completed the walk. There don't appear to be any errors in the walk status suggesting why it stopped; it finishes as though there were only 2120 URLs in my list.

Are there any particular details in a URL that could cause this to happen, such as certain extensions?

I also find that when I use a refresh crawl it follows links and indexes those pages even though I am using the single page method. Any idea why this is happening?

Posted: Tue Feb 10, 2009 4:30 pm
by mark
Check the vortex.log to see if there's an error there.

For the refresh of singles you need to download the latest scripts from the website.

Posted: Tue Feb 10, 2009 4:47 pm
by kiddywood
My Vortex.log only goes up to 28 Nov 2008. There is no information relating to any recent walks.

Posted: Tue Feb 10, 2009 5:52 pm
by mark
Generate an intentional error to ensure that logging is working, for example by accessing ".../qwerty" instead of ".../dowalk". If the error doesn't get logged to vortex.log, check its permissions etc.

Download the latest scripts and see if the problem persists.

Posted: Thu Feb 12, 2009 10:05 am
by kiddywood
I've managed to open Vortex.log and it is working properly.

The latest walk has stopped after just 1129 pages, and vortex.log contains this message:

Vortex (73588) ABEND: exception 0xC0000005 (ACCESS_VIOLATION) at ip 0x1002071d TID:0x0001A6E8: processing URL http://www.adt.co.uk/new-vacancies.html at line 822 byte 55 running JavaScript at line 692
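
For anyone wanting to pull every crashing URL out of a long vortex.log, here is a minimal sketch. It assumes ABEND lines follow the format of the message above; the function name `find_abend_urls` and the regular expression are illustrative only and are not part of Texis or Webinator.

```python
import re

# Assumed line format, based only on the ABEND message quoted above:
#   ... ABEND: exception <hex code> (<NAME>) ... processing URL <url> ...
ABEND_RE = re.compile(
    r"ABEND: exception (0x[0-9A-Fa-f]+) \((\w+)\).*?processing URL (\S+)"
)


def find_abend_urls(log_text):
    """Return (exception code, exception name, URL) for each ABEND line that names a URL."""
    results = []
    for line in log_text.splitlines():
        match = ABEND_RE.search(line)
        if match:
            results.append((match.group(1), match.group(2), match.group(3)))
    return results
```

Reading the log file into a string and passing it to `find_abend_urls` gives a list of the URLs being processed at each crash. ABEND lines that do not name a URL are skipped.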

Posted: Thu Feb 12, 2009 11:04 am
by mark
What's your Texis version? From a command prompt run

INSTALLDIR\texis.exe -version

and paste the result back here.

Posted: Thu Feb 12, 2009 11:17 am
by kiddywood
C:\Program Files\thunderstone software>texis.exe -version
Texis Web Script (Vortex) Copyright (c) 1996-2008 Thunderstone - EPI, Inc.
Webinator Professional Version 5.01.1218161949 20080808 (i686-intel-winnt-64-32)

Posted: Fri Feb 13, 2009 11:34 am
by mark
I can't replicate that with just that URL. Maybe it's settings-related or something about earlier URLs. Try setting Parallelism:Threads to 1. If you don't need JavaScript processing, you could try turning that off.
If you remove that one URL or move it to the beginning of the list, does it make a difference?
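
With a list of around 18000 URLs, the remove-or-move test is easier to script than to do by hand. The following is a rough sketch, assuming the list is available as one URL per line; the helper name `reorder_urls` and its `mode` parameter are hypothetical and only for illustration.

```python
def reorder_urls(urls, suspect, mode="front"):
    """Return a copy of `urls` with `suspect` moved to the front or removed.

    `urls` is a list of URL strings (surrounding whitespace is ignored when
    comparing). If `suspect` is not in the list, the list is returned unchanged.
    """
    if mode not in ("front", "remove"):
        raise ValueError("mode must be 'front' or 'remove'")
    rest = [u for u in urls if u.strip() != suspect]
    if len(rest) == len(urls):
        # The suspect URL was not found, so there is nothing to change.
        return list(urls)
    if mode == "remove":
        return rest
    return [suspect] + rest
```

Writing the returned list back out, one URL per line, gives a file for the next test walk. Running one walk with `mode="remove"` and another with `mode="front"` covers both variations suggested above.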

Posted: Fri Feb 13, 2009 11:49 am
by Kai
Open a tech support ticket (menu at top) with a copy of your texis -version and the ABEND message, and we'll continue to investigate.

Posted: Fri Feb 13, 2009 12:25 pm
by kiddywood
I've tried starting another walk and it now ends after only 5 URLs.

I am getting these error messages:

Cannot create directory C:\Program Files: Cannot create a file when that file already exists

User PUBLIC has been added without a password.

Vortex (108224) ABEND: exception 0xC0000005 (ACCESS_VIOLATION) at ip 0x1001e89a processing

Unable to determine free space. Will proceed assuming there is enough. in the function create index