Excludes

Posted: Wed Aug 13, 2003 2:32 pm
by mmcfadden
I am trying to exclude from the database only the table of contents pages we use to create the index. We don't want a search to turn up a table of contents page. However, we don't want to exclude the pages that those table of contents pages link to. It seems to me that this can be done with Extra domains in combination with the excludes. Any ideas I can try?

Excludes

Posted: Wed Aug 13, 2003 2:46 pm
by mark
Put
<meta name="robots" content="noindex,follow">
on the contents pages. There's no combination of walk settings that will do that.
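
To illustrate what noindex,follow asks a crawler to do, here is a rough Python sketch. It is not how the Thunderstone walker is implemented; the class and function names (RobotsMetaParser, robots_policy) are made up for this example, and it only assumes the usual meaning of the robots meta directives: noindex keeps the page text out of the index, while follow still lets the crawler visit its links.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() == "robots":
            for token in attr_map.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def robots_policy(html):
    """Return (index_page, follow_links); both default to True (hypothetical helper)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    blocked_all = "none" in parser.directives
    index_page = not blocked_all and "noindex" not in parser.directives
    follow_links = not blocked_all and "nofollow" not in parser.directives
    return index_page, follow_links


contents_page = (
    '<html><head><meta name="robots" content="noindex,follow"></head>'
    '<body><a href="chapter1.html">Chapter 1</a></body></html>'
)
index_page, follow_links = robots_policy(contents_page)
print(index_page, follow_links)
```

For the contents page above, the page itself is not indexed but its links are still followed, which is the behavior asked for in the first post.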

http://www.thunderstone.com/texis/site/ ... eta-robots

Excludes

Posted: Wed Feb 25, 2004 2:55 pm
by mmcfadden
I have successfully used <meta name="robots" content="noindex,follow"> on my site. I would like to automatically remove table of contents pages from the index. Is there a way to set up a post-index exclude, or a removal of URLs matching a certain pattern, that works in dowalk? I would also like to understand the difference between Excludes and Prefix Excludes, and when you would want to use a Prefix Exclude. I have typically used, for example, http://mysite.com/docs/* as an exclude, or a single URL such as http://mysite.com/docs/file.doc.

Excludes

Posted: Wed Feb 25, 2004 4:13 pm
by mark
You mean you want to exclude the noindex,follow pages? They're in the database with URL only, no data, so a search shouldn't find them. But they remain so that parent-child surfing works. If you want them completely out, you can change this line in dowalk
<$SSc_metarobotsplaceholder=Y>
to
<$SSc_metarobotsplaceholder=N>

You would have to modify dowalk to delete records after a walk.

Excludes match anywhere within the URL. Prefix excludes only match the beginning. Given the URLs
http://site/dir/private/something.html
http://site/otherdir/private/another.html
You could exclude them with 2 prefixes
http://site/dir/private/
http://site/otherdir/private/
or one exclude
/private

Just different ways of approaching things. One will work better than the other in some circumstances. Sometimes either will work just as well.
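
The difference between the two match styles can be sketched in a few lines of Python. This is only an illustration that assumes plain substring and starts-with matching, as described above; it ignores wildcards and anything else the real walk settings may support. The is_excluded helper and the /public/ URL are hypothetical, added for the example.

```python
def is_excluded(url, excludes=(), prefix_excludes=()):
    """Return True if url matches any exclude or prefix exclude (hypothetical helper).

    Assumption: an exclude matches anywhere in the URL (substring test),
    a prefix exclude matches only at the start of the URL.
    """
    if any(pattern in url for pattern in excludes):
        return True
    return any(url.startswith(prefix) for prefix in prefix_excludes)


urls = [
    "http://site/dir/private/something.html",
    "http://site/otherdir/private/another.html",
    "http://site/dir/public/index.html",  # hypothetical URL that should stay in
]

# Approach 1: two prefix excludes, one per directory.
by_prefix = [
    is_excluded(
        u,
        prefix_excludes=("http://site/dir/private/", "http://site/otherdir/private/"),
    )
    for u in urls
]

# Approach 2: a single exclude that matches anywhere in the URL.
by_substring = [is_excluded(u, excludes=("/private",)) for u in urls]

print(by_prefix)
print(by_substring)
```

Both approaches drop the two private URLs and keep the public one. Under the same assumption, the single exclude casts a wider net: /private would also match a URL such as http://site/privateer/page.html, which neither prefix would.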