How to exclude the directory browsing page?

twu
Posts: 22
Joined: Fri Oct 19, 2007 2:57 pm

Post by twu »

Hi, I am creating a site that contains two URLs, e.g.:
http://www.mysite.com/myApplication/ -- this contains all the application pages.
http://www.mysite.com/NR/Resource/ -- this contains all the resource documents that are linked from the application pages, e.g. PDF files and DOC files.

I entered both URLs as Base URLs. Because http://www.mysite.com/NR/Resource/ has several levels of subdirectories, and they contain no index pages, only PDF files, I had to enable the Directory Browsing option for http://www.mysite.com/NR/Resource/. The problem now is that when a user searches for "index of", the results list all the directory index pages. How can I avoid this?
I did a little searching and found that modifying the dowalk script would do the trick, but is there a way to do this through configuration alone, such as excluding by field?
jason112
Site Admin
Posts: 347
Joined: Tue Oct 26, 2004 5:35 pm

Post by jason112 »

If you're generating the pages, you can add a standard "robots" command to the page that tells Webinator (and other search engines) not to use the page's contents.

Add this in the <head> of the html pages:

<meta name="robots" content="noindex"/>
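
To confirm that a given page really carries the tag, something like the following rough Python sketch could be used. It only inspects HTML text you pass in; the class and function names (`RobotsMetaParser`, `has_noindex`) are made up for illustration and are not part of Webinator.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def has_noindex(html_text):
    """Return True if the HTML text contains a robots 'noindex' directive."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" in parser.directives
```

Feed it the saved source of one of the directory listing pages: if `has_noindex` returns False, the tag is not actually being emitted for that page.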
jason112
Site Admin
Posts: 347
Joined: Tue Oct 26, 2004 5:35 pm

Post by jason112 »

twu
Posts: 22
Joined: Fri Oct 19, 2007 2:57 pm

Post by twu »

Thanks for your quick response, but those are not real pages. They are generated by the web server to list the subdirectories and the files in them.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

They're still "pages" even if they're dynamically generated by the server. Jason's suggestion still applies.
twu
Posts: 22
Joined: Fri Oct 19, 2007 2:57 pm

Post by twu »

Thanks guys, I tried Jason's REX, but it didn't work. I found that the following setting excludes most of the index pages, except for the top-level directory (http://www.mysite.com/NR/Resource/):
Query: http://www.mysite.com/NR/Resource/
Field: URL
Exclude to: Links Only

The only index page that still shows up is the top directory, and the PDFs show up fine.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »
