Indexing a directory

michaelbarton
Posts: 13
Joined: Mon May 23, 2005 3:27 pm

Post by michaelbarton »

I have a directory of files (Word, PDF, ...). There are no HTML files. I can view the contents of the directory through a browser, but I can't get Webinator to index the directory. Is there a setting in the tool that allows it to spider the contents of a directory? Is there a specific format for the baseurl field?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

The base URL should be the same one you use in your browser (make sure it has the standard required trailing / on the URL). A directory listing will look like any other HTML page to the browser or web indexer. Check the walk status to see what it said about that page. Did it get that page or not? If not, what was the error? If so, check list/edit urls for that URL to see what content and children were found on that page. Try setting verbosity to 4 and doing a new (not refresh) walk to get more info about discarded URLs.
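
To illustrate why the trailing / matters, here is a minimal Python sketch of how relative links in a directory listing resolve against the base URL. It is not how Webinator works internally; the host example.com, the sample listing, and the names LinkCollector and extract_links are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Relative links are resolved the same way a browser would.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs of all links found in the given HTML."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical listing, similar to what a server's auto-index page might return.
SAMPLE_LISTING = """
<html><body><h1>Index of /docs/</h1>
<a href="report.pdf">report.pdf</a>
<a href="notes.doc">notes.doc</a>
</body></html>
"""

with_slash = extract_links(SAMPLE_LISTING, "http://example.com/docs/")
without_slash = extract_links(SAMPLE_LISTING, "http://example.com/docs")
```

With the trailing slash the children resolve under /docs/; without it they resolve against the parent path (for example http://example.com/report.pdf), so a crawler that is not redirected to the slash form would look for the files in the wrong place.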

Make sure your robots.txt, meta robots values, and walk settings all allow indexing of that page.
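
One way to check the robots.txt side independently of the crawler is Python's standard urllib.robotparser. This is only a sketch: the robots.txt content, the user agent name ExampleCrawler, and the URLs are assumptions for illustration, and it does not cover meta robots tags or walk settings.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks every crawler from /docs/.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /docs/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleCrawler" is a placeholder; use whatever agent name your indexer sends.
blocked = is_allowed(SAMPLE_ROBOTS_TXT, "ExampleCrawler", "http://example.com/docs/")
open_path = is_allowed(SAMPLE_ROBOTS_TXT, "ExampleCrawler", "http://example.com/files/")
```

Here blocked is False and open_path is True, so a directory under /docs/ would be skipped by any crawler that honors this file. To test a real server, paste in the text of its own robots.txt.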
michaelbarton
Posts: 13
Joined: Mon May 23, 2005 3:27 pm

Post by michaelbarton »

I get a timeout message:
Timeout sending data to xxxx.xxxx.xxxx.com:80
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

That sounds like a connectivity issue. Is the appliance on the same network as the workstation that's able to reach the server? Does the server or any firewall between it and the appliance allow the appliance the same kind of access as your workstation?

Or, if the server is just slow to respond, increase the page time under all walk settings.
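
If you can run a script from a machine on the same network segment as the appliance, a rough way to tell "unreachable" apart from "just slow" is a plain TCP connection test with an explicit timeout. This is a sketch using Python's standard socket module; the function name can_connect is made up, and it only tests whether the port accepts connections, not whether the page is served.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        # create_connection raises an OSError subclass on refusal,
        # name lookup failure, or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, can_connect("fileserver.example.com", 80, timeout=30.0), where the host name is a placeholder for your own server. If this fails from the appliance's network but works from your workstation, a firewall or routing difference is the likelier cause. If it succeeds only with a long timeout, a longer page time may be enough.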