When I run a walk, it reaches a certain number of pages and then seems to get stuck, possibly looping. The number of pages processed stays the same, but the duplicate and error counts keep increasing, seemingly forever.
We are using Commercial Webinator 4.3.7 for Windows, with the PDF plugin.
For whatever reason, the texis version appears to be different: Commercial Webinator Version 4.02.1031937844 of Sep 13, 2002
We are walking http://www.moen.com, which is publicly accessible. I'll give you the walk settings if you want to try it yourself.
Here's an excerpt of what we are seeing in the walk status:
Webinator Walk Report for Builder
Creating database d:\thunderstonesoftware\webinator/texis/Builder/db1...Done.
Walk started at 2004-04-02 11:25:02 (by user)
Verbosity set to 3
JavaScript walking not enabled by current license
HTTPS walking disabled
Start fetching at http://www.moen.com/Builder/BuilderHome.cfm
http://www.moen.com/Builder/BuilderHome.cfm
Ignore urls containing any of the following:
/cgi-bin/
~
ptype=w
ptype=r
ptype=b
ptype=c
/productcatalog/
contest=
&page=
DealerInfoAction
started 1 (5700) on http://www.moen.com/Builder/BuilderHome.cfm
798 pages fetched (17,555,478 bytes) from http://www.moen.com/Builder/BuilderHome.cfm
started 1 (860) on http://showhouse.moen.com/
421 pages fetched (89,066,182 bytes) from http://showhouse.moen.com/
started 1 (4572) on http://www.moen.com/Consumer/legal.cfm
11 pages fetched (11,964,319 bytes) from http://www.moen.com/Consumer/legal.cfm
started 1 (5464) on http://showhouse.moen.com/
0 pages fetched (89,066,182 bytes) from http://showhouse.moen.com/
started 1 (4572) on http://www.moen.com/Consumer/legal.cfm
0 pages fetched (11,964,319 bytes) from http://www.moen.com/Consumer/legal.cfm
started 1 (5464) on http://showroomofdistinction.moen.com/
0 pages fetched (0 bytes) from http://showroomofdistinction.moen.com/
started 1 (5464) on http://showhouse.moen.com/
0 pages fetched (89,066,182 bytes) from http://showhouse.moen.com/
started 1 (4852) on http://www.moen.com/
0 pages fetched (11,953,080 bytes) from http://www.moen.com/
started 1 (4572) on http://showhouse.moen.com/
0 pages fetched (89,066,182 bytes) from http://showhouse.moen.com/
started 1 (5464) on http://www.moen.com/
0 pages fetched (11,953,080 bytes) from http://www.moen.com/
started 1 (4572) on http://showhouse.moen.com/
0 pages fetched (89,066,182 bytes) from http://showhouse.moen.com/
started 1 (5464) on http://www.moen.com/
0 pages fetched (11,953,080 bytes) from http://www.moen.com/
started 1 (4572) on http://showhouse.moen.com/
0 pages fetched (89,066,182 bytes) from http://showhouse.moen.com/
started 1 (5464) on http://www.moen.com/
0 pages fetched (11,953,080 bytes) from http://www.moen.com/
started 1 (5464) on http://showhouse.moen.com/
1230 pages (630,893,686 bytes) so far.
70 errors so far.
2448 duplicate pages so far.
1230 http://www.moen.com/Consumer/Products/K ... FPSink.cfm (14,567 bytes)
1229 http://www.moen.com/Consumer/BuyMoen/bu ... Number.cfm (747 bytes)
1228 http://www.moen.com/Consumer/products/s ... wering.cfm (13,488 bytes)
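To make the apparent loop easier to see, here is a rough sketch of how the repeated "0 pages fetched" restarts in the status output could be tallied per start URL. It is not part of Webinator; the function name count_empty_restarts and the line pattern are my own assumptions, based only on the line format shown in the excerpt above.

```python
import re
from collections import Counter

# Pattern assumed from the "N pages fetched (B bytes) from URL" lines
# in the walk status excerpt; other report versions may differ.
FETCHED = re.compile(r"^(\d+) pages fetched \(([\d,]+) bytes\) from (\S+)$")

def count_empty_restarts(report_text):
    """Count, per start URL, how many walker runs fetched 0 pages."""
    empty = Counter()
    for line in report_text.splitlines():
        match = FETCHED.match(line.strip())
        if match and int(match.group(1)) == 0:
            empty[match.group(3)] += 1
    return empty

# A few lines copied from the walk status above.
sample = """\
started 1 (5464) on http://showhouse.moen.com/
0 pages fetched (89,066,182 bytes) from http://showhouse.moen.com/
started 1 (4852) on http://www.moen.com/
0 pages fetched (11,953,080 bytes) from http://www.moen.com/
started 1 (4572) on http://showhouse.moen.com/
0 pages fetched (89,066,182 bytes) from http://showhouse.moen.com/
"""

# Print the start URLs that keep restarting without fetching anything.
for url, restarts in count_empty_restarts(sample).most_common():
    print(restarts, url)
```

Run against the full status page, a count that keeps growing for the same few start URLs would back up the suspicion that the walker is cycling between them rather than making progress.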
There is nothing in the vortex.log file since the start of the latest rewalk, but there is some older info in there, probably from my killing previous runs:
000 Apr 2 11:01:57 [webinatoradmin=webinatoradmin]
Index d:\thunderstonesoftware\webinator\texis\Builder\db1\xhtmlid reported to exist, but does not. in the function opendbidx
006 Apr 2 11:02:04 [webinatoradmin=webinatoradmin]
(5328) Can't write stdout via web server to 10.4.3.76 (Broken pipe); exiting
000 Apr 2 11:02:08 [webinatoradmin=webinatoradmin]
Index d:\thunderstonesoftware\webinator\texis\Builder\db1\xhtmlid reported to exist, but does not. in the function opendbidx
006 Apr 2 11:02:12 (812) Can't write stdout (Bad file descriptor); exiting
Suggestions?