can't rewalk

djeckel
Posts: 16
Joined: Thu Feb 08, 2001 3:36 pm

Post by djeckel »

mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

You can try deleting the database and recreating it with -create, but that shouldn't really be any different than using -wipe.

What exact error message are you getting from gw?
What's your version and release as reported by gw -version?
You might try using the gw option -dns=sys:
gw -d- -dns=sys http://mydomain.com

Turning up verbosity might be helpful. Add a -v9 to your gw command line.
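
To check, independently of gw, whether the operating system's own resolver can look up the host, something like the following may help. This is only a minimal sketch in Python and is not part of Webinator; the function name resolve_host is hypothetical, and the host you pass in would be your own (mydomain.com is just the placeholder used in this thread).

```python
import socket


def resolve_host(hostname, port=80):
    """Return the sorted, unique IP addresses the OS resolver gives for hostname."""
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as err:
        # The system resolver itself could not find the name.
        raise RuntimeError(f"cannot resolve {hostname!r}: {err}") from err
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    import sys

    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    print(host, "->", resolve_host(host))
```

If this resolves the name quickly while gw still reports a timeout, the problem is more likely in gw's lookup than in the machine's DNS setup.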
djeckel
Posts: 16
Joined: Thu Feb 08, 2001 3:36 pm

Post by djeckel »

On Friday it took the -wipe but would not -create, so we pulled the db from backups.
Trying to -create, we kept getting "Can't resolve host 'mydomain.com': Timeout (no time left)".
But I can get http://mydomain.com in the browser on that machine and run ./geturl http://mydomain.com successfully.

I just tried:
gw -dns=sys http://mydomain.com -rewalk
No database specified. Use the default?
Y
Getting http://206.1.96.2/robots.txt...Not there...Ok.
Adding todo: http://mydomain.com
http://mydomain.com/
0/0
011 Can't resolve host 'slackinc.com': Timeout (no time left)
0/0
Visited 1 pages total
Indexing new pages

Adding -v9 to the command yielded similar results.
Here are the last few lines:

create metamorph inverted index xhtmlbod on html(Title\Meta\Body);
Host: 206.1.96.226:80 (walkable) mydomain.com
exclude:
rewalk
robots.txt:
Host: 206.1.96.226:80 (walkable) mydomain.com
getip() called 2 times. 0 hits
gethostbyname() called 2 times

We're running version 2.56, release 20000824.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

That particular release of Webinator had a problem with daylight saving time that would cause that kind of timeout in DNS lookups. Thunderstone will contact you directly with instructions on how to download a fix.
djeckel
Posts: 16
Joined: Thu Feb 08, 2001 3:36 pm

Post by djeckel »

Ok, after installing the fix, we still can't rewalk. I get these errors:

175 table hosts not found in the data dictionary.
115 No such table: hosts in the database: D:\wwwroot\webinator\_db\
000 SQLexecute <> failed with -1

100 Document access forbidden: http://www.slackinc.com/ returned code 403 <access forbidden>

visited 3 pages total
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

The first ones about hosts are just warnings that you only see with verbosity turned up. Ignore them.

The real problem is the 403 error from your webserver. www.slackinc.com is not allowing you to walk it. Check the webserver's error log and configuration to find out why and fix it to allow you to walk it.
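
One way to see what the webserver returns to a non-browser client is to request the page from a script and print the status code. The following is a minimal sketch in Python, not anything shipped with Webinator; fetch_status and its opener parameter are hypothetical names, and the User-Agent strings you try are your own choice (this thread does not say what gw sends).

```python
import urllib.error
import urllib.request


def fetch_status(url, user_agent=None, timeout=10, opener=urllib.request.urlopen):
    """Return the HTTP status code the server sends for url."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers are raised as exceptions; the code is what we want.
        return err.code


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(sys.argv[1], "->", fetch_status(sys.argv[1]))
```

Running it against the start URL with and without a browser-like User-Agent can show whether the 403 depends on who is asking or applies to every client.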
djeckel
Posts: 16
Joined: Thu Feb 08, 2001 3:36 pm

Post by djeckel »

We are using IIS4. Is there a specific location where these settings (to allow walking) would reside? Also, which logs do you mean: the server event logs?

I don't understand how it can visit 3 pages and then stop.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

We can't know what you've set up that might be causing the webserver to deny gw permission to view the page. A default setup doesn't generally cause any problems. Check the event logs.

What 3 pages is it getting successfully?
djeckel
Posts: 16
Joined: Thu Feb 08, 2001 3:36 pm

Post by djeckel »

It actually seems to have gotten only 2: the default page and the Webinator search results page.
What am I looking for in the event logs?

Thanks
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Look for anything that occurs at the same time you're running gw.
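
If the logs can be exported as text, narrowing them to the period of the gw run can be scripted. The sketch below is in Python and rests on an assumption: that each log line begins with a timestamp in the form YYYY-MM-DD HH:MM:SS. Your server's log format may differ, in which case the slice and format string need adjusting. The name entries_in_window is hypothetical.

```python
from datetime import datetime, timedelta


def entries_in_window(lines, start, end, slack=timedelta(minutes=1)):
    """Yield log lines whose leading timestamp is within [start - slack, end + slack]."""
    for line in lines:
        try:
            # Assumed layout: the first 19 characters are "YYYY-MM-DD HH:MM:SS".
            stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # header, comment, or a line in some other format
        if start - slack <= stamp <= end + slack:
            yield line
```

Note the time just before starting gw and just after it stops, pass those as start and end, and read whatever comes back for denied requests.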