Reindex new content

legedza.henry
Posts: 142
Joined: Wed Jul 24, 2002 11:52 pm

Post by legedza.henry »

Your newsletter again posed this question: Does your system index new and changed content every few minutes?

How can I set Webinator (5.1.46) to do this? Our intranet has over 12,000 links, and a crawl normally takes 3 or so hours to complete, so it is usually run only every couple of days.

I would prefer to have it index new content as it is added, but I am not sure how.

Thanks
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

The most common method is to set Webinator to do a Refresh crawl every minute. You can adjust the minimum and maximum refresh periods to suit your needs. This works best if your web server provides accurate Last-Modified information and respects the "If-Modified-Since" header, which allows a fast check of whether the content has changed.
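
To see whether a web server respects "If-Modified-Since", a conditional request can be sent from outside Webinator. The following is only a rough Python sketch and not part of Webinator: the function names `conditional_headers` and `has_changed` are made up for illustration, and it assumes the server answers 304 Not Modified for unchanged pages.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
import urllib.error
import urllib.request


def conditional_headers(last_crawled: datetime) -> dict:
    """Build an If-Modified-Since header from a timezone-aware datetime."""
    # HTTP dates are always expressed in GMT.
    stamp = format_datetime(last_crawled.astimezone(timezone.utc), usegmt=True)
    return {"If-Modified-Since": stamp}


def has_changed(url: str, last_crawled: datetime, timeout: float = 10.0) -> bool:
    """Return False if the server replies 304 Not Modified, True on a normal reply."""
    request = urllib.request.Request(url, headers=conditional_headers(last_crawled))
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True  # full response: the page is new or the header was ignored
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib reports 304 as an HTTPError
            return False
        raise
```

If `has_changed` returns True for a page that has not been touched since the given time, the server is probably ignoring the header or sending no usable modification date, and each refresh will have to fetch the page in full.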

In addition, if there is a page that contains links to new content, you can add that URL to the Watch Url so that new content is added on every refresh.
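
To judge whether a page is a good candidate for the Watch Url, it can help to list the links it exposes. This is a small Python sketch using only the standard library; it does not show how Webinator itself parses pages, and `LinkCollector`, `links_on_page`, and the example host are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def links_on_page(html: str, base_url: str) -> list:
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


page = '<a href="/docs/a.html">A</a> <a href="b.html">B</a>'
print(links_on_page(page, "http://intranet.example/new/"))
```

A "what's new" or recent-changes page whose links cover the newly added documents is the kind of page this approach relies on.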

Depending on how the content is updated, you may also be able to obtain a list of recently changed documents and use it in the crawl.
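
As one possible way to produce such a list, assuming the content lives as files under a document root that maps directly onto a base URL (an assumption about your setup, not something Webinator requires), a short Python sketch could look like this. The function name, paths, and host are hypothetical, and how the resulting list is handed to the crawl is not shown.

```python
from pathlib import Path


def recently_changed_urls(doc_root: str, base_url: str, since_epoch: float) -> list:
    """Return URLs for files under doc_root modified after since_epoch (seconds)."""
    root = Path(doc_root)
    prefix = base_url.rstrip("/") + "/"
    urls = []
    for path in sorted(root.rglob("*")):
        # Compare each file's modification time with the last crawl time.
        if path.is_file() and path.stat().st_mtime > since_epoch:
            urls.append(prefix + path.relative_to(root).as_posix())
    return urls
```

For example, `recently_changed_urls("/var/www/intranet", "http://intranet.example", last_crawl_time)` would return one URL per file changed since `last_crawl_time`, which could then be written one per line to a text file.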
John Turnbull
Thunderstone Software