Walk doesn't find index.html, but directory listing

Posted: Tue May 24, 2005 2:57 pm
by pete.smith
I think I'm onto why our intranet search isn't working right. I have

internal/test/index.html

The index.html has just one link on it, to "inside2/".

When I crawl internal/test/

The slurp results show a directory listing of the /test/ dir, and the walk does not follow the link to inside2. I have tried changing my browser agent, but no dice. Ideas?

Pete

Posted: Tue May 24, 2005 3:26 pm
by John
Which webserver are you using? Is it configured to use index.html as a default file, and does the webserver user have permission to read it?

Posted: Tue May 24, 2005 4:02 pm
by pete.smith
Apache 1.3. Yes, it most definitely reads that; if I bring that up in a browser, all I see is one simple link.

Posted: Tue May 24, 2005 4:42 pm
by mark
Webinator is a web client, much like your browser. It can't get underneath index.html to the directory listing unless the webserver gives out that data, so your webserver must be configured to vary its behavior based on something. Check the webserver's config. Also check its error and transfer logs to see whether anything odd is happening for that request from Webinator versus your browser.

Try removing "index.html" from webinator's "index Name" setting.
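
One way to test whether the webserver varies its response by client is to request the same URL with different User-Agent headers and see which ones get a directory listing. This is only a minimal sketch: the URL and both agent strings in the usage comment are placeholders (copy the real strings from your transfer log), and the "Index of /" check is a heuristic based on the listing body quoted in this thread.

```python
import urllib.request


def looks_like_directory_listing(html: str) -> bool:
    """Heuristic: auto-generated listings contain the text 'Index of /<path>'."""
    return "index of /" in html.lower()


def fetch(url: str, user_agent: str) -> str:
    """Fetch a URL while sending the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def compare_agents(url: str, agents: dict) -> dict:
    """Map each agent label to True if that agent was served a directory listing."""
    return {label: looks_like_directory_listing(fetch(url, agent))
            for label, agent in agents.items()}


# Hypothetical usage (placeholder URL and agent strings):
# compare_agents("http://intranet.example/test/",
#                {"browser": "Mozilla/4.0 (compatible)",
#                 "crawler": "agent string copied from the transfer log"})
```

If the labels come back with different values, the server is keying on the client in some way; if every agent gets the same page, look at something other than the User-Agent.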

Posted: Tue May 24, 2005 5:58 pm
by pete.smith
So should the List/Edit URLs for the index page NOT show

Body: Index of /test (listing of files in dir)

even though, if I bring it up in my browser, I just get one simple link? It should just show that one link in the words indexed? This is most definitely the problem.

Posted: Tue May 24, 2005 9:40 pm
by mark
Yes. List/Edit URLs should show a body similar to the text on the page you see in your browser.

Posted: Wed May 25, 2005 10:10 am
by pete.smith
OK, I'm onto the problem. We have symbolic links on our web server, and that was causing it to loop forever, since it would see each one as a brand new URL and keep going. How can we stop it from following symlinks like that?

Posted: Wed May 25, 2005 11:01 am
by mark
I can't guess how to fix your webserver, especially without details about the problem. Symbolic links generally aren't a problem for Apache.
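
If the loop really does come from a symbolic link that points back up its own directory tree, the offending links can be located on disk to gather those details. The following is a rough sketch under that assumption; the function name is made up for illustration, and the directory you pass in would be your own document root.

```python
import os


def find_looping_symlinks(root: str) -> list:
    """Return (link, target) pairs where a symlinked directory resolves
    to the directory that contains it or to one of that directory's ancestors."""
    loops = []
    # followlinks=False (the default) keeps os.walk itself from descending
    # into symlinked directories, so this scan cannot loop.
    for dirpath, dirnames, _filenames in os.walk(root, followlinks=False):
        parent = os.path.realpath(dirpath)
        for name in dirnames:
            link = os.path.join(dirpath, name)
            if not os.path.islink(link):
                continue
            target = os.path.realpath(link)
            # A crawler following this link would end up back above where it started.
            if parent == target or parent.startswith(target.rstrip(os.sep) + os.sep):
                loops.append((link, target))
    return loops
```

Each pair it returns is a link that makes the same content reachable under an ever-longer URL, which is the pattern a walk would treat as brand new pages.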

Posted: Wed May 25, 2005 12:37 pm
by pete.smith
How can I make it search the disk as well as the http:// part? We have an intranet where people dump files and don't necessarily link to them.

Pete

Posted: Wed May 25, 2005 1:05 pm
by mark
If they're dumped into web directories, they can be made accessible via http by not having an index.html file in the directory and configuring the webserver to allow automatic directory indexing.

Otherwise you'll need full Texis or an appliance to index file: URLs.
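
For the first option, it can help to know which directories already contain an index file, since those would be served the index page instead of an automatic listing. A small sketch, assuming "index.html" is the only index name configured; the function name and the directory you pass in are illustrative, not anything from Webinator or Apache.

```python
import os


def dirs_with_index_file(docroot: str, index_names=("index.html",)) -> list:
    """List directories under docroot that contain an index file,
    which the webserver would serve instead of an automatic listing."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(docroot):
        if any(name in filenames for name in index_names):
            hits.append(dirpath)
    return sorted(hits)
```

Unlinked files in any directory it reports would stay invisible to a walk unless the index file is removed or something links to them.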