Walk stops early

kiddywood
Posts: 41
Joined: Tue Feb 10, 2009 2:49 pm

Post by kiddywood »

Hi,

I'm trying to walk a list of around 18,000 URLs using 'single page', but the walk stops at around 2,120. I have checked for URLs that could be causing the walk to stop, but there is nothing obvious. It just stops and says it has completed the walk. There don't appear to be any errors in the walk status suggesting why it stopped; it finishes as though there were only 2,120 URLs in my list.

Are there any particular details in a URL that could cause this to happen, such as certain extensions?

I also find that when I use a refresh crawl, it follows links and indexes those pages even though I am using the single page method. Any idea why this is happening?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Check the vortex.log to see if there's an error there.

For the refresh of singles you need to download the latest scripts from the website.
kiddywood
Posts: 41
Joined: Tue Feb 10, 2009 2:49 pm

Post by kiddywood »

My Vortex.log only goes up to 28 Nov 2008. There is no information relating to any recent walks.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Generate an intentional error to ensure that logging is working, for example by accessing ".../qwerty" instead of ".../dowalk". If the error doesn't get logged to vortex.log, check its permissions etc.
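
For the permissions check, a quick script can report whether the log file exists, is writable, and when it was last modified. This is only a rough sketch: the function name `log_status` is made up for illustration, the path to vortex.log depends on your install and has to be passed in, and on Windows `os.access` may only reflect the read-only attribute rather than the full ACLs.

```python
import os
import time


def log_status(path):
    """Report whether a log file exists, is writable, and how old it is.

    `path` is whatever location your install uses for vortex.log
    (an assumption: it is not hard-coded here).
    """
    if not os.path.exists(path):
        return {"exists": False, "writable": False, "age_seconds": None}
    return {
        "exists": True,
        # On Windows this may only check the read-only flag, not ACLs.
        "writable": os.access(path, os.W_OK),
        # Seconds since the file was last modified.
        "age_seconds": time.time() - os.path.getmtime(path),
    }
```

A large `age_seconds` right after triggering the intentional error would suggest that nothing is being written to the file.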

Download the latest scripts and see if the problem persists.
kiddywood
Posts: 41
Joined: Tue Feb 10, 2009 2:49 pm

Post by kiddywood »

I've managed to open Vortex.log and it is working properly.

The latest walk stopped after just 1129 pages, and vortex.log has this message:

Vortex (73588) ABEND: exception 0xC0000005 (ACCESS_VIOLATION) at ip 0x1002071d TID:0x0001A6E8: processing URL http://www.adt.co.uk/new-vacancies.html at line 822 byte 55 running JavaScript at line 692
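
If the log is long, a small script can pull out the URL named in each ABEND line. This is only a sketch: the pattern is inferred from the single message quoted above, so other Texis versions may format the line differently, and `find_abend_urls` is a made-up helper name.

```python
import re

# Pattern inferred from the one ABEND line quoted in this thread (assumption).
ABEND_RE = re.compile(
    r"ABEND: exception (?P<code>0x[0-9A-Fa-f]+) \((?P<name>\w+)\)"
    r".*?processing URL (?P<url>\S+)"
)


def find_abend_urls(lines):
    """Return (exception name, URL) for every ABEND line that names a URL."""
    hits = []
    for line in lines:
        m = ABEND_RE.search(line)
        if m:
            hits.append((m.group("name"), m.group("url")))
    return hits


sample = (
    "Vortex (73588) ABEND: exception 0xC0000005 (ACCESS_VIOLATION) "
    "at ip 0x1002071d TID:0x0001A6E8: processing URL "
    "http://www.adt.co.uk/new-vacancies.html at line 822 byte 55 "
    "running JavaScript at line 692"
)
print(find_abend_urls([sample]))
```

ABEND lines that do not name a URL are skipped rather than guessed at.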
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

What's your Texis version? From a command prompt run

INSTALLDIR\texis.exe -version

and paste the result back here.
kiddywood
Posts: 41
Joined: Tue Feb 10, 2009 2:49 pm

Post by kiddywood »

C:\Program Files\thunderstone software>texis.exe -version
Texis Web Script (Vortex) Copyright (c) 1996-2008 Thunderstone - EPI, Inc.
Webinator Professional Version 5.01.1218161949 20080808 (i686-intel-winnt-64-32)
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

I can't replicate that with just that URL. Maybe it's settings-related or something about earlier URLs. Try setting Parallelism:Threads to 1. If you don't need JavaScript processing, you could try turning that off.
If you remove that one URL or move it to the beginning of the list, does it make a difference?
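
For a list of 18,000 URLs, editing by hand is tedious, so the two variants of the test could be produced with something like the following. It is a minimal sketch that assumes the list is plain text with one URL per line; `reorder_url_list` and the example.com entries are hypothetical.

```python
def reorder_url_list(urls, suspect, mode="front"):
    """Return a new list with `suspect` removed or moved to the front.

    mode="remove" drops every occurrence of the suspect URL;
    mode="front" puts one copy first and drops the others.
    If the suspect is not in the list, a copy of the list is returned.
    """
    if mode not in ("front", "remove"):
        raise ValueError("mode must be 'front' or 'remove'")
    cleaned = [u.strip() for u in urls if u.strip()]
    rest = [u for u in cleaned if u != suspect]
    if mode == "remove" or len(rest) == len(cleaned):
        return rest
    return [suspect] + rest


urls = [
    "http://example.com/a.html",  # hypothetical entry
    "http://www.adt.co.uk/new-vacancies.html",
    "http://example.com/b.html",  # hypothetical entry
]
suspect = "http://www.adt.co.uk/new-vacancies.html"
print(reorder_url_list(urls, suspect, mode="front"))
print(reorder_url_list(urls, suspect, mode="remove"))
```

Walking each variant separately should show whether the crash follows that one URL or depends on what was fetched before it.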
Kai
Site Admin
Posts: 1272
Joined: Tue Apr 25, 2000 1:27 pm

Post by Kai »

Open a tech support ticket (menu at top) with a copy of your texis -version and the ABEND message, and we'll continue to investigate.
kiddywood
Posts: 41
Joined: Tue Feb 10, 2009 2:49 pm

Post by kiddywood »

I've tried starting another walk and it now ends after only 5 URLs.

I am getting these error messages:

Cannot create directory C:\Program Files: Cannot create a file when that file already exists

User PUBLIC has been added without a password.

Vortex (108224) ABEND: exception 0xC0000005 (ACCESS_VIOLATION) at ip 0x1001e89a processing

Unable to determine free space. Will proceed assuming there is enough. in the function create index