Each of our directories contains a subdirectory named 'rev' for
revision control. We do NOT want this directory indexed. Is there a
way to have Webinator NOT look in these directories when an index.html
is not present?
Example: my.site.com/foo/ does not have an index.html file. Therefore
it will produce an "Index Of" page listing all files and directories.
Can we stop it from looking in the foo/rev/ and foo/bar/rev/ directories
WITHOUT specifying the full paths as -xhttp://my.site.com/foo/rev/ in
a file for use with -m?
In theory what I'd like to do is say:
gw -xrev/ (I know this doesn't work, I've tried it.)
without having to say:
gw -s "delete from html where Url like '/rev/'"
or listing all full URLs?
Temporarily, my workaround is -mrev.set
where rev.set was created using find . -name rev
Obviously, this is not a very good workaround, since it requires
recreating the rev.set file each time.
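
For reference, a minimal sketch of how the rev.set regeneration could be
scripted so it can run before each walk. It assumes the file passed to -m
holds one full-URL -x exclusion per line (the form quoted above) and that
the document root maps directly onto the site root. The function name
make_rev_set, the docroot path, and the base URL are placeholders, not
anything gw itself provides.

```shell
#!/bin/sh
# Hypothetical helper: print one "-x<URL>/" line for every directory
# named "rev" under the given document root.
#   $1 = document root on disk (assumed to map to the site root)
#   $2 = base URL without a trailing slash, e.g. http://my.site.com
make_rev_set() {
    docroot=$1
    baseurl=$2
    # List rev directories relative to the docroot, then rewrite
    # "./foo/rev" into "-x<baseurl>/foo/rev/".
    ( cd "$docroot" && find . -type d -name rev ) |
        sed -e 's|^\./||' -e "s|^|-x${baseurl}/|" -e 's|$|/|'
}

# Example usage (placeholder path and URL):
#   make_rev_set /path/to/docroot http://my.site.com > rev.set
```

This only automates the workaround; it still has to be rerun whenever a
new rev directory appears.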
---------
Also, a slightly unrelated question:
While evaluating your software, I have found that a few documents have
broken external links. Namely, they link to "http://www.domain.com/",
but there is no such domain name. Why does gw try resolving this?
I'm using all three possible ways of excluding this link:
-jmy.site.com
-xhttp://www.domain.com/
-domain=my.site.com
None of those work. This DRASTICALLY slows down indexing, since each
DNS lookup takes about 65 seconds to time out (yes, I'm using -w0).
This link is included at least 5 times!
My biggest question here is: why is it trying to read www.domain.com?
With all three of those options set, it SHOULD (as I understand it)
completely disregard that link.
Thanks in advance,
Tim Rosine