Walking Errors

velevi
Posts: 42
Joined: Thu Sep 08, 2005 12:21 pm

Post by velevi »

Hi. I am working with Webinator Pro. The sites I am indexing into a profile amount to about 7000 pages and documents. (The html.tbl file is about 100 MB; refs.tbl and the other refs files are about 19 MB.)

I am not sure whether the following errors have anything to do with the size of the profile database. The dowalk script is scheduled to trigger a "Refresh" walk every morning.

I have been getting the following:

0 pages fetched (0 bytes) from <some url>
started 1 refresh (5069) on <some url>
0 pages fetched (0 bytes) from <some url>
started 1 refresh (5070) on <some url>
0 pages fetched (0 bytes) from <some url>
started 1 refresh (5071) on <some url>
0 pages fetched (0 bytes) from <some url>
7386 pages fetched (810,513,136 bytes) Total
-24596 errors Total
29319 duplicate pages Total

or similar output before the end:

started 1 refresh (935) on <url 1>
started 2 refresh (949) on <url 2>
started 3 refresh (951) on <url 3>
0 pages fetched (0 bytes) from <url 3>
started 3 new (953) on <url 4>
0 pages fetched (39,759 bytes) from <url 4>
started 3 refresh (955) on <url 5>
0 pages fetched (0 bytes) from <url 5>
started 3 refresh (956) on <url 6>
0 pages fetched (0 bytes) from <url 6>
started 3 refresh (957) on <url 7>
started 4 refresh (966) on <url 8>
0 pages fetched (0 bytes) from <url 8>
started 4 refresh (987) on <url 9>
0 pages fetched (0 bytes) from <url 9>
started 4 refresh (988) on <url 10>
0 pages fetched (0 bytes) from <url 10>
started 4 refresh (989) on <url 11>
0 pages fetched (0 bytes) from <url 11>
started 4 refresh (998) on <url 12>
started 5 refresh (1010) on <url 13>
0 pages fetched (0 bytes) from <url 13>
started 5 refresh (1017) on <url 14>
0 pages fetched (0 bytes) from <url 14>

The profile is functioning properly and is definitely searchable, but this output is baffling.
Can you decipher it, and do you have any idea where the errors are coming from?
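
A small script can make walk output like the above easier to read. The sketch below is in Python, not anything shipped with Webinator. It pairs each "started" line with its matching "pages fetched" line and lists the URLs that have no fetch summary yet. The function name is made up, and the reading of the two numbers on a "started" line (a count of running walkers and a process ID) is only a guess from the log shown here.

```python
import re

# Patterns inferred from the log excerpt in this post; other versions may differ.
STARTED = re.compile(r"^started (\d+) (refresh|new) \((\d+)\) on (.+)$")
FETCHED = re.compile(r"^([\d,]+) pages fetched \(([\d,]+) bytes\) from (.+)$")


def summarize_walk_log(lines):
    """Return (started, fetched, unfinished) parsed from walk log lines."""
    started = {}  # url -> (running count?, walk type, process id?)  (guessed meanings)
    fetched = {}  # url -> (pages, bytes)
    for raw in lines:
        line = raw.strip()
        m = STARTED.match(line)
        if m:
            started[m.group(4)] = (int(m.group(1)), m.group(2), int(m.group(3)))
            continue
        m = FETCHED.match(line)
        if m:
            pages = int(m.group(1).replace(",", ""))
            size = int(m.group(2).replace(",", ""))
            fetched[m.group(3)] = (pages, size)
    # URLs that were started but have not reported a fetch summary.
    unfinished = [url for url in started if url not in fetched]
    return started, fetched, unfinished
```

Lines such as the "Total" summaries do not match either pattern and are skipped.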

Thank you very much!!
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

I'm not sure I see any errors there. On the Walk Status page how many pages are scheduled for refresh?
John Turnbull
Thunderstone Software
velevi
Posts: 42
Joined: Thu Sep 08, 2005 12:21 pm

Post by velevi »

Many pages are scheduled for refresh. I just wasn't sure what's causing the errors:

1937 errors "Keep-Alive content received without Content-Length from..."
251 errors "Duplicate of..."
30 other errors

What really confused me, though, was why the total error count was so large and why it was negative. It looks like the count is never reset and just keeps growing more negative. This seems to happen whenever a Refresh walk runs. If I run a New walk, the problem goes away.

I have changed the 'dowalk' script a bit, but I haven't touched the part that summarizes the walk. Do you know of a bug like this, and how to fix it?

Thanks
(The log statements in my first post were also baffling, but they were not the errors per se.)
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

What version of the scripts are you running? IIS servers often don't work right with keep-alive on. Edit the script to turn off maxkeepaliverequests.
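
For background on the "Keep-Alive content received without Content-Length" message: on a persistent connection the client needs either a Content-Length header or chunked transfer encoding to know where a response body ends. The Python sketch below shows that check in isolation. It is an illustration only, not how Webinator detects the condition, and the function name is made up.

```python
def body_length_is_ambiguous(headers):
    """Return True if a keep-alive response does not say where its body ends.

    `headers` is a plain dict of response header names to values.
    Simplification: responses that never carry a body (for example 204 or
    304 replies, or replies to HEAD requests) are not special-cased here.
    """
    h = {name.lower(): value.strip().lower() for name, value in headers.items()}
    keep_alive = "keep-alive" in h.get("connection", "")
    has_length = "content-length" in h
    chunked = "chunked" in h.get("transfer-encoding", "")
    return keep_alive and not has_length and not chunked
```

A server that sends "Connection: keep-alive" with neither header leaves the client unable to tell where the body stops without waiting for the connection to close, which is why turning keep-alive off avoids the problem.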

The running tally of errors can sometimes go negative with refresh walks. Ignore the total tally or get the latest scripts. (It had something to do with duplicates being counted as errors in some places but not in others.)
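
A toy model of how a tally like that could drift negative, written in Python. This is not the actual dowalk logic. It assumes, purely for illustration, that duplicates are always subtracted from the running total but only sometimes included in the error count they are subtracted from. All names and numbers are hypothetical.

```python
def update_error_tally(total, real_errors, duplicates, dups_reported_as_errors):
    """Fold one walk's counts into a running error total (hypothetical logic)."""
    # The walk's reported error count includes duplicates only some of the time.
    reported = real_errors + (duplicates if dups_reported_as_errors else 0)
    # Assumed bug: duplicates are subtracted every time, included or not.
    return total + reported - duplicates


total = 0
for _ in range(3):  # three refresh walks where duplicates were not reported as errors
    total = update_error_tally(total, real_errors=10, duplicates=100,
                               dups_reported_as_errors=False)
print(total)  # -270
```

Under this assumption the total only stays correct on walks where duplicates were counted as errors in the first place. Every other walk pushes it further below zero.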