Search Appliance does not respect robot META tags
Posted: Tue May 23, 2006 8:55 am
by dietric
I have a profile set up to respect robot META tags.
However, it still indexes pages that are specified as noindex,nofollow:
http://sandsports.off-road.com/dunes/ma ... ?id=227437
Posted: Tue May 23, 2006 10:03 am
by John
If you check the link under List/Edit URLs, does it have any content? An empty placeholder is kept for NOINDEX pages. It may be more useful to add emailContent.jsp to the Exclusions, so the page is never fetched in the first place, rather than fetching it and only then finding the NOINDEX.
Posted: Tue May 23, 2006 11:40 am
by dietric
They don't have any content.
I'm building the tags programmatically, and the values are data-driven, so moving this to a robots.txt file would be pretty complex. I'm mostly concerned about bogging down the appliance with indexing pages that are not eligible for search, and eating up my available indexes. Any thoughts on how this affects the walk durations?
Posted: Tue May 23, 2006 11:59 am
by John
The trouble with META robots tags is that they are not seen until the page has been downloaded and processed, whereas using Exclusions or robots.txt allows the determination to be made beforehand.
An upcoming update to the search appliance will have an option to not store the placeholders.
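
To illustrate the difference described above, here is a minimal Python sketch of a generic crawl loop. It is not the appliance's actual implementation: the names `RobotsMetaParser`, `is_excluded` and `crawl`, the substring-based exclusion rule, and the `fetch` callback are all assumptions made for the example. It shows that an exclusion is decided from the URL alone, while a META robots NOINDEX is only discovered after the page has been fetched and parsed, leaving a placeholder behind.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def is_excluded(url, exclusions):
    # Assumption: an exclusion is a plain substring of the URL.
    # A real crawler may use prefixes, wildcards or regular expressions.
    return any(pattern in url for pattern in exclusions)


def crawl(urls, exclusions, fetch):
    """Return (indexed, placeholders, fetch_count) for the given URLs.

    `fetch` is a caller-supplied function mapping a URL to its HTML text.
    """
    indexed, placeholders, fetch_count = [], [], 0
    for url in urls:
        # Decided before any download: excluded URLs cost nothing.
        if is_excluded(url, exclusions):
            continue
        html = fetch(url)
        fetch_count += 1
        # Only now, after downloading, can the META robots tag be seen.
        parser = RobotsMetaParser()
        parser.feed(html)
        if "noindex" in parser.directives:
            placeholders.append(url)  # empty entry, no searchable content
        else:
            indexed.append(url)
    return indexed, placeholders, fetch_count
```

Calling `crawl` with an empty exclusion list fetches every page and records a placeholder for each NOINDEX page; passing `["emailContent.jsp"]` as the exclusion list skips those URLs before any download, which is where the saving in walk time would come from.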