Don't want to walk index pages

steve_hart
Posts: 3
Joined: Mon Aug 16, 2004 8:54 pm

Post by steve_hart »

John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

What do you want done with the index.php pages? If you want their links followed but the pages themselves kept out of the index, you could use Exclude by Field to exclude those pages.
John Turnbull
Thunderstone Software
steve_hart
Posts: 3
Joined: Mon Aug 16, 2004 8:54 pm

Post by steve_hart »

I don't want the index.php pages to show in any search results. I tried adding "index.php" to the exclusion parameters, but this did not exclude the pages from the search results.

You mentioned I could exclude by field. Which field would you recommend using? URL? What would the query string look like?

Thanks for your help.
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

Yes. Use Exclude by Field with the field set to URL and the query set to index.php, and choose to exclude Pages Only. Pages with index.php in the URL would then not be indexed, but their links would still be followed.
John Turnbull
Thunderstone Software
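
For readers who want to see the "exclude pages only, still follow links" behavior spelled out, here is a minimal Python sketch. It is not Thunderstone's implementation: the in-memory SITE map, the example.com URLs, and the walk() function are all hypothetical, and it assumes the exclusion acts as a plain substring match on the URL.

```python
from collections import deque

# Hypothetical in-memory site: URL -> list of URLs it links to.
SITE = {
    "http://example.com/index.php": [
        "http://example.com/a/index.php",
        "http://example.com/about.html",
    ],
    "http://example.com/a/index.php": ["http://example.com/a/page1.html"],
    "http://example.com/about.html": [],
    "http://example.com/a/page1.html": [],
}


def walk(start, exclude_query):
    """Breadth-first walk: follow links on every page, store only non-excluded pages."""
    indexed = []
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        # Assumption: a page is excluded when the query is a substring of its URL.
        if exclude_query not in url:
            indexed.append(url)
        # Links are followed whether or not the page itself was stored.
        for link in SITE.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return indexed


result = walk("http://example.com/index.php", "index.php")
```

In this sketch both index.php pages are left out of result, while about.html and a/page1.html, which are reachable only through them, are still stored.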
renloe
Posts: 35
Joined: Mon Jan 31, 2005 12:51 pm

Post by renloe »

Will this automatically recognize URLs like:

http://my.site.com/dir1/somephpdir/

This directory is served by an index.php file, but index.php does not appear in the URL itself. Will it still eliminate these entries from the search results?

Our tests show that it does not. Is there a better way to eliminate these entries?
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

No. Is there anything in the text or otherwise unique about the page to identify it?
John Turnbull
Thunderstone Software
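
A short Python sketch of why a URL-based exclusion cannot catch the directory form. It assumes the exclusion behaves like a substring test on the URL, which matches the test results reported above but is not confirmed as the product's actual logic. The excluded_by_url() helper and the explicit .../index.php URL are hypothetical, added only for contrast.

```python
def excluded_by_url(url, query="index.php"):
    """Assumed behavior: exclude a page when the query text occurs in its URL."""
    return query in url


urls = [
    # Hypothetical explicit form of the same page.
    "http://my.site.com/dir1/somephpdir/index.php",
    # Directory form from the question; index.php never appears in it.
    "http://my.site.com/dir1/somephpdir/",
]

flags = {url: excluded_by_url(url) for url in urls}
```

Only the explicit form is flagged. The directory URL carries no trace of index.php, so something in the page itself (text or a meta tag) is needed to identify it.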
renloe
Posts: 35
Joined: Mon Jan 31, 2005 12:51 pm

Post by renloe »

No, there is not. I believe we could add some meta tags to the pages if needed, though.
What meta tag would you recommend if we were to go this route?

<meta index="true">
Would something like this work? If so, what would we need to add into each of the Exclude by Field options?
Thank You
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

If you went that route, then the Query would be true, the Meta would be index, and the Exclude would be "Pages only".
John Turnbull
Thunderstone Software
renloe
Posts: 35
Joined: Mon Jan 31, 2005 12:51 pm

Post by renloe »

We went ahead and added the meta tag:
<meta name="indexpage" content="true" />

And added into the config:
Query = name
meta = indexpage
field = html
exclude = pages only

But this is not working correctly. It does ignore all pages with the meta tag, but it does not look like it is following all the links on the ignored pages. So we end up missing several hundred pages from the index.
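
One way to check, outside the crawler, which pages actually carry the marker tag is a small script like the sketch below. It uses only Python's standard html.parser module. The MetaFinder class and has_marker() function are hypothetical helpers written for this illustration, and the script says nothing about how the walker itself evaluates the tag.

```python
from html.parser import HTMLParser


class MetaFinder(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also arrive here by default.
        if tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            if name:
                self.meta[name.lower()] = attributes.get("content") or ""


def has_marker(html, name="indexpage", content="true"):
    """Return True if the page has <meta name=NAME content=CONTENT>."""
    finder = MetaFinder()
    finder.feed(html)
    return finder.meta.get(name, "").lower() == content


tagged = '<html><head><meta name="indexpage" content="true" /></head></html>'
untagged = "<html><head><title>Some page</title></head></html>"
```

Running has_marker() over saved copies of the missing pages would show whether any of them carry the tag themselves, which would explain their absence independently of link following.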
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

If you run with Verbosity set to 4 then the pages being excluded will still be listed under List/Edit URLs, and you can choose children to see why the children were not indexed.
John Turnbull
Thunderstone Software