Don't want to walk index pages

steve_hart
Posts: 3
Joined: Mon Aug 16, 2004 8:54 pm

Don't want to walk index pages

Post by steve_hart »

I don't want the index.php pages to show in any search results. I tried adding "index.php" to the exclusion parameters, but this did not exclude the pages from the search results.

You mentioned I could exclude by field. Which field would you recommend using? URL? What would the query string look like?

Thanks for your help.
renloe
Posts: 35
Joined: Mon Jan 31, 2005 12:51 pm

Don't want to walk index pages

Post by renloe »

Will this automatically recognize URLs like:

http://my.site.com/dir1/somephpdir/

This directory is served by an index.php file, but "index.php" does not appear in the actual URL. Will it still eliminate these entries from the search results?

Our tests show that it does not. Is there a better way to eliminate these entries?
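
To illustrate why a plain "index.php" exclusion misses the directory-style URL above, here is a minimal Python sketch. It is not Webinator's matching logic, only a plain substring test on the URL path, and the helper name path_contains is made up for this example.

```python
from urllib.parse import urlsplit


def path_contains(url, needle="index.php"):
    """Return True if the URL's path literally contains `needle`.

    Hypothetical helper: a plain substring test, not Webinator's own matcher.
    """
    return needle in urlsplit(url).path


# The directory URL never mentions index.php, so a text pattern cannot match it.
print(path_contains("http://my.site.com/dir1/somephpdir/"))
print(path_contains("http://my.site.com/dir1/somephpdir/index.php"))
```

Because the file name is never part of the directory-style URL, any exclusion keyed on the URL text alone has nothing to match there, which is why a marker inside the page itself (such as a meta tag) may be the more dependable route.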
renloe
Posts: 35
Joined: Mon Jan 31, 2005 12:51 pm

Don't want to walk index pages

Post by renloe »

No, there is not. I believe we could add some meta tags to the pages, though, if needed.
What would you recommend that the meta tag be if we were to go this route?

<meta index="true">
Would something like this work? If so, what would we need to add into each of the Exclude by Field options?
Thank You
renloe
Posts: 35
Joined: Mon Jan 31, 2005 12:51 pm

Don't want to walk index pages

Post by renloe »

We went ahead and added the meta tag:
<meta name="indexpage" content="true" />

And added this to the config:
Query = name
meta = indexpage
field = html
exclude = pages only

But this is not working correctly. It does ignore all pages with the meta tag, but it does not look like it is following all the links on the ignored pages. So we end up missing several hundred pages from the index.
renloe
Posts: 35
Joined: Mon Jan 31, 2005 12:51 pm

Don't want to walk index pages

Post by renloe »

I actually found an error in my dowalk script that was causing that problem, but after fixing it I re-ran the indexing with verbosity set to 4, and the Exclude by Field option did nothing.
Webinator indexed all the pages with the newly added "indexpage" meta tag, and all those pages we were trying to eliminate are now showing up in the search results...
renloe
Posts: 35
Joined: Mon Jan 31, 2005 12:51 pm

Don't want to walk index pages

Post by renloe »

There are no errors in the vortex.log file.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Don't want to walk index pages

Post by mark »

I think the Exclude by Field settings you want are:
Query = true
meta = indexpage
exclude = pages only
And make sure you're doing "new" rather than "refresh" walks.

If you've customized dowalk try an unmodified one from the website example scripts page to see if the same problem occurs.
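
Before re-walking, it may help to confirm that the pages really carry the tag the exclusion is keyed on. The following is a rough Python sketch for checking a page's HTML for a meta tag named "indexpage" whose content is "true". It runs outside Webinator and assumes nothing about how Webinator parses meta fields; the names MetaFinder and has_indexpage_flag are hypothetical.

```python
from html.parser import HTMLParser


class MetaFinder(HTMLParser):
    """Collects <meta name=... content=...> pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        if name is not None:
            self.meta[name.lower()] = attr_map.get("content") or ""


def has_indexpage_flag(html_text):
    """Return True if the page has <meta name="indexpage" content="true">."""
    finder = MetaFinder()
    finder.feed(html_text)
    return finder.meta.get("indexpage", "").strip().lower() == "true"


page = '<html><head><meta name="indexpage" content="true" /></head></html>'
print(has_indexpage_flag(page))
```

Note that a tag written as <meta index="true"> has no name attribute, so this check would not see it; the name/content form is the one the settings above refer to.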