webinator - dowalk not indexing - japanese site

Posted: Tue May 31, 2005 12:28 pm
by scott.shaver
Okay, I've read through the other posts here about problems indexing Japanese sites and I can't make heads or tails of the answers. I created a new profile for a Japanese site that uses the shift_jis charset, changing only the base URL in the profile.

When I start the walk it simply sits there and spins. No pages are indexed. The stop walk buttons don't work; I have to delete the profile to stop it.

The same thing happens when I try to index Yahoo's Japanese site, and when I change the Source Default Charset in the profile to shift_jis.

Some specific information on how to set up a profile for a foreign-language site would be great.

Posted: Tue May 31, 2005 2:13 pm
by Kai
Which version of Webinator are you using (complete output of texis -version, run from the command line)?

Posted: Tue May 31, 2005 2:25 pm
by scott.shaver
Texis Web Script (Vortex) Copyright (c) 1996-2005 Thunderstone - EPI, Inc
Free Webinator Version 5.01.1116433182 20050518 (i686-intel-winnt-32-32)

Just a note: I'm trying to do this to prove to my company that it is worth my time to do a full eval of the appliance.

Posted: Tue May 31, 2005 3:29 pm
by Kai
There was a fix just after that release for an issue where JavaScript setInterval() or setTimeout() calls on a page could cause an abend or loop. Is there JavaScript on the Base URL that you are using? Also, check Texis\vortex.log in the Webinator install dir for any errors.
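
A quick way to check a page for those two calls, as a rough sketch in Python. This is not part of Webinator; the function name find_timer_calls is made up for illustration, and it only looks at the HTML text you hand it, so anything in external .js files would have to be fetched and checked separately.

```python
import re

# Matches a call to either timer function, e.g. "setTimeout(" or "window.setInterval (".
TIMER_CALL = re.compile(r"\b(setInterval|setTimeout)\s*\(")


def find_timer_calls(html):
    """Return the sorted names of the JavaScript timer functions called in html."""
    return sorted(set(TIMER_CALL.findall(html)))


# Example use on a page saved from the browser (the file name is hypothetical):
#   with open("base_url_page.html", encoding="shift_jis", errors="replace") as fh:
#       print(find_timer_calls(fh.read()))
```

An empty list means neither call appears in that text; it says nothing about scripts the page loads from other URLs.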

Posted: Tue May 31, 2005 3:39 pm
by scott.shaver
http://mcdatajp.hkit4u.com/ - Japanese
http://mcdatakr.hkit4u.com/ - Korean

There is JavaScript code on those pages; however, there are no references to the functions you mentioned.

Here are the log entries related to my attempts to index a Japanese and a Korean site.

100 2005-05-31 09:00:27 f:\Webinator_running\texis\scripts/webinator/dowalk:3885: Document not found: http://mcdatajp.hkit4u.com/robots.txt returned code 404 (Object Not Found)
100 2005-05-31 09:00:28 f:\Webinator_running\texis\scripts/webinator/dowalk:3885: Document not found: http://mcdatajp.hkit4u.com/robots.txt returned code 404 (Object Not Found)
000 2005-05-31 09:16:53 f:\Webinator_running\texis\scripts/webinator/dowalk:5375: Wrong server id
000 2005-05-31 09:16:53 f:\Webinator_running\texis\scripts/webinator/dowalk:5375: Wrong server id
000 2005-05-31 09:16:53 f:\Webinator_running\texis\scripts/webinator/dowalk:5375: Wrong server id
000 2005-05-31 09:16:53 f:\Webinator_running\texis\scripts/webinator/dowalk:5375: Wrong server id
000 2005-05-31 09:16:53 f:\Webinator_running\texis\scripts/webinator/dowalk:5375: Wrong server id
000 2005-05-31 09:16:53 f:\Webinator_running\texis\scripts/webinator/dowalk:4992: Wrong server id
002 2005-05-31 09:16:53 f:\Webinator_running\texis\scripts/webinator/dowalk:4992: Unable to open table f:\Webinator_running\texis\japan\db2\counts in the function opendbtbl
115 2005-05-31 09:16:53 f:\Webinator_running\texis\scripts/webinator/dowalk:4992: No such table: counts in the database: f:\Webinator_running\texis\japan\db2\
000 2005-05-31 09:16:53 f:\Webinator_running\texis\scripts/webinator/dowalk:4992: SQLPrepare() failed with -1 in the function prepntexis
100 2005-05-31 09:17:52 /webinator/dowalk:9525: User PUBLIC has been added without a password.
100 2005-05-31 09:18:41 f:\Webinator_running\texis\scripts/webinator/dowalk:2071: User PUBLIC has been added without a password.
100 2005-05-31 09:18:42 f:\Webinator_running\texis\scripts/webinator/dowalk:3885: Document not found: http://mcdatakr.hkit4u.com/robots.txt returned code 404 (Object Not Found)
100 2005-05-31 09:18:43 f:\Webinator_running\texis\scripts/webinator/dowalk:3885: Document not found: http://mcdatakr.hkit4u.com/robots.txt returned code 404 (Object Not Found)
000 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:5375: Wrong server id
000 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:5375: Wrong server id
000 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:5375: Wrong server id
000 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:5375: Wrong server id
000 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:5375: Wrong server id
000 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:4992: Wrong server id
002 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:4992: Unable to open table f:\Webinator_running\texis\korea\db2\counts in the function opendbtbl
115 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:4992: No such table: counts in the database: f:\Webinator_running\texis\korea\db2\
000 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:4992: SQLPrepare() failed with -1 in the function prepntexis
000 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:5779: Wrong server id
002 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:5779: Unable to open table f:\Webinator_running\texis\korea\db2\error in the function opendbtbl
115 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:5779: No such table: error in the database: f:\Webinator_running\texis\korea\db2\
000 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:5779: SQLExecute() failed with -1 in the function execntexis
000 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:5780: Wrong server id
002 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:5780: Unable to open table f:\Webinator_running\texis\korea\db2\error in the function opendbtbl
115 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:5780: No such table: error in the database: f:\Webinator_running\texis\korea\db2\
000 2005-05-31 09:38:55 f:\Webinator_running\texis\scripts/webinator/dowalk:5780: SQLExecute() failed with -1 in the function execntexis
100 2005-05-31 09:39:30 /webinator/dowalk:9525: User PUBLIC has been added without a password.
100 2005-05-31 09:41:44 f:\Webinator_running\texis\scripts/webinator/dowalk:2071: User PUBLIC has been added without a password.
100 2005-05-31 09:41:46 f:\Webinator_running\texis\scripts/webinator/dowalk:3885: Document not found: http://mcdatajp.hkit4u.com/robots.txt returned code 404 (Object Not Found)
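
Since many of the entries above repeat, it can help to collapse them before reading. The following is a minimal sketch in Python, assuming every entry follows the layout seen in this log (three-digit code, date, time, script path with line number, then the message). The names LOG_LINE and summarize are illustrative only and are not Texis or Webinator tools.

```python
import re
from collections import Counter

# Layout assumed from the entries in this thread:
#   <code> <date> <time> <script path>:<line>: <message>
LOG_LINE = re.compile(
    r"^(?P<code>\d{3}) (?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<script>.*?):(?P<line>\d+): (?P<message>.*)$"
)


def summarize(lines):
    """Count entries by (code, script line number, message); skip lines that do not match."""
    counts = Counter()
    for raw in lines:
        m = LOG_LINE.match(raw.strip())
        if m:
            counts[(m["code"], m["line"], m["message"])] += 1
    return counts


# Example use against the log file mentioned earlier in the thread:
#   with open(r"Texis\vortex.log", errors="replace") as fh:
#       for (code, line, message), n in summarize(fh).most_common():
#           print(n, code, line, message)
```

Grouping this way puts the repeated "Wrong server id" entries on one row, which makes the distinct "No such table" errors easier to spot.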

Posted: Wed Jun 01, 2005 11:57 am
by John
That looks like a more general error. Does the user that the scripts will run as (typically the IUSR_ account) have full control of the dataspace directory? It looks as if it did not create the database correctly. If you set the permissions and try a new walk, it should work.

Posted: Wed Jun 01, 2005 12:15 pm
by scott.shaver
Well, I'm assuming it has full access, since it has no problem creating the databases for the three English sites I've indexed. Those errors may just be because I deleted the profiles and they weren't really deleted, so when I tried to create them again with the same names it puked.

I created a new profile again today using the defaults and changing the base URL to http://mcdatajp.hkit4u.com, and it still just sits and spins. Here is the log output:

100 2005-06-01 10:13:35 f:\Webinator_running\texis\scripts/webinator/dowalk:2071: User PUBLIC has been added without a password.
100 2005-06-01 10:13:36 f:\Webinator_running\texis\scripts/webinator/dowalk:3885: Document not found: http://mcdatajp.hkit4u.com/robots.txt returned code 404 (Object Not Found)
100 2005-06-01 10:13:37 f:\Webinator_running\texis\scripts/webinator/dowalk:3885: Document not found: http://mcdatajp.hkit4u.com/robots.txt returned code 404 (Object Not Found)

Posted: Wed Jun 01, 2005 12:16 pm
by scott.shaver
And from the monitor.log file:

200 2005-06-01 10:12:18 (9272) Database Monitor on f:\Webinator_running\texis\jp2\db1 starting
200 2005-06-01 10:13:18 (9272) Database Monitor on f:\Webinator_running\texis\jp2\db1 exiting
200 2005-06-01 10:13:37 (9308) Database Monitor on f:\Webinator_running\texis\jp2\db2 starting
200 2005-06-01 10:13:39 (8948) Database Monitor on f:\Webinator_running\texis\jp2\db1 starting
200 2005-06-01 10:14:39 (8948) Database Monitor on f:\Webinator_running\texis\jp2\db1 exiting

Posted: Wed Jun 01, 2005 12:58 pm
by scott.shaver
Looking a bit closer, those sites appear to be some sort of ASP.NET app. It seems to be doing some sort of funky redirect from the base URL of http://mcdatajp.hkit4u.com. I used telnet to hit that URL and got back an odd page, not the same one that shows up in the browser.

I tried the following in a browser:

http://mcdatajp.hkit4u.com/index.html
http://mcdatajp.hkit4u.com/index.htm
http://mcdatajp.hkit4u.com/default.html
http://mcdatajp.hkit4u.com/default.htm
http://mcdatajp.hkit4u.com/default.asp
http://mcdatajp.hkit4u.com/index.asp

and get a 404 back on each one. I think the ASP thing is causing the problem. Unfortunately our group doesn't control these sites right now, as they are temporary outsourced solutions for Asia Pacific.
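
The telnet and browser checks above can be repeated in one pass with a short script. This is a sketch in Python using only the standard library, not anything Webinator provides. The helper names candidate_urls and probe are made up here, and the list of file names is simply the one tried above. Redirects are deliberately not followed, so a 3xx answer and its Location header stay visible, and the Content-Type header shows whatever charset the server declares.

```python
import urllib.error
import urllib.request

# File names taken from the manual browser test in this post.
DEFAULT_NAMES = ["index.html", "index.htm", "default.html",
                 "default.htm", "default.asp", "index.asp"]


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so the 3xx response itself is reported."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def candidate_urls(base, names=DEFAULT_NAMES):
    """Return the site root followed by each candidate default page."""
    root = base.rstrip("/") + "/"
    return [root] + [root + name for name in names]


def probe(url, timeout=10):
    """Fetch url once; return (status, Content-Type header, Location header)."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Content-Type"), None
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Content-Type"), err.headers.get("Location")


# Example use (needs network access):
#   for url in candidate_urls("http://mcdatajp.hkit4u.com"):
#       print(url, probe(url))
```

If the root URL answers with a redirect, the Location value is a reasonable thing to try as the profile's base URL. Connection failures are not caught and would surface as urllib.error.URLError.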

Posted: Wed Jun 01, 2005 1:17 pm
by scott.shaver
Attempted to index http://cnn.co.jp/ and got the same spinning problem. Attempted to index a French site, which kind of worked (no spinning), but the live search never returned any results.

Has anyone actually used the free Webinator to index a non-English site?