Why are links being followed?

Posted: Fri Apr 18, 2008 3:49 pm
by online2008
I have a text file with under 600 URLs. I put the URL of this file in the Page URL field and set Max Depth to 0.

When I check the walk status, it shows 19,999 pages in the index. Shouldn't only about 600 pages be indexed?

More to the point, the search results are turning up irrelevant information. How do I limit the search to just those 600 pages?

I also tried putting the 600 URLs in the Single Page field and got the same result (19,999 pages in the index).
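
To compare the walk status count against the list itself, it may help to count the distinct URLs in the text file first. This is a minimal sketch, not part of Webinator: it assumes the list is plain text with one URL per line, and the function name unique_urls is hypothetical.

```python
from urllib.parse import urlsplit


def unique_urls(lines):
    """Return the distinct http(s) URLs from an iterable of lines, in first-seen order.

    Assumes one URL per line; blank lines and anything that does not
    parse as an http or https URL are skipped.
    """
    seen = set()
    result = []
    for line in lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # skip lines that are not absolute web URLs
        if url not in seen:
            seen.add(url)
            result.append(url)
    return result


# Small in-memory sample standing in for the real list file.
sample = [
    "http://example.com/a\n",
    "http://example.com/b\n",
    "\n",
    "http://example.com/a\n",
    "not a url\n",
]
print(len(unique_urls(sample)), "distinct URLs")
```

For a real list, passing an open file object (for example, open("urls.txt"), where the file name is a placeholder) in place of sample should give the number to expect in the index if no links are followed.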

Posted: Fri Apr 18, 2008 5:04 pm
by mark
What do you have in Base URL? Anything in Extra domains?

Is rewalk mode set to new or refresh? You probably want new.

Posted: Fri Apr 18, 2008 6:17 pm
by online2008
Hello Mark,

The Base URL is the same as the server on which Webinator was uploaded. Nothing is in Extra domains.

I've changed the rewalk mode to new to see what different results this will produce.

--C

Posted: Sun Apr 20, 2008 7:44 am
by online2008
Hello Mark,

We've made the following changes and now have the following setup:

--the Base URL, Watch URL, and Page URL are all the same, i.e., the URL of the text file with the list of URLs to be searched

--Max Depth is 0

--nothing is in Extra domains

--Rewalk is New


Happily, Webinator is now working exactly as required, in that it searches and returns results from only the pages whose URLs we've listed in the text file.

Now that the old 19,999-page walk has been purged, we'll change the Rewalk to Refresh.

Is there anything in this setup that will cause problems down the road?


--C

Posted: Mon Apr 21, 2008 10:37 am
by mark
Base URL and Page URL should not be the same. They mean different things. In your case, you probably just want Base URL with Max Depth 0 and no Page URL.

Watch URL is only meaningful if you've set rewalk schedule to on change.

Posted: Mon Apr 21, 2008 12:04 pm
by online2008
Okay.

Thank you very much, Mark, for the sterling support.

--C