Controlling what gets refreshed

julian_loy
Posts: 8
Joined: Thu May 20, 2004 9:30 am

Post by julian_loy »

Is there any way to control what gets updated or examined when I run a refresh? I'm running the refresh from the command line against some potentially huge databases. I would like to specify which pages get refreshed, and have only those pages and the links found on them refreshed. I am using the URL file to specify the pages, but other pages are still being refreshed as their refresh times expire. Is there a setting I'm missing, or do I need to hack the dowalk script? I started changing the script but did not have much success.

Thanks for your help.
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

The refresh processes any URLs whose NextCheck field in the database has come due. You could use SQL to set the NextCheck field appropriately for your needs, either as each page is refreshed or when you start a refresh. That might need some editing of the dowalk script.
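
To picture the idea, here is a minimal sketch in Python rather than Texis SQL. It is only a model: the `restrict_refresh` helper, the dict standing in for the html table, the example URLs, and the 10-year push-out are all assumptions made for illustration, and it does not deal with links discovered on the selected pages.

```python
from datetime import datetime, timedelta


def restrict_refresh(next_check, wanted_urls, now, far=timedelta(days=3650)):
    """Return a new URL -> NextCheck map in which only wanted_urls are due.

    next_check  -- dict of URL -> NextCheck datetime (hypothetical stand-in
                   for the html table)
    wanted_urls -- set of URLs the next refresh should pick up
    far         -- how far to push every other row into the future (assumed)
    """
    updated = {}
    for url in next_check:
        if url in wanted_urls:
            updated[url] = now          # due as soon as a refresh starts
        else:
            updated[url] = now + far    # outside any near-term refresh
    return updated


now = datetime(2004, 6, 1, 12, 0, 0)
table = {
    "http://example.com/a.html": now - timedelta(days=2),
    "http://example.com/b.html": now - timedelta(days=1),
    "http://example.com/c.html": now + timedelta(days=30),
}
updated = restrict_refresh(table, {"http://example.com/a.html"}, now)

# A refresh started at or after `now` would treat these rows as due.
due = sorted(url for url, when in updated.items() if when <= now)
print(due)
```

In this model only a.html is left due; b.html, which had already expired, is pushed out along with everything else.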
John Turnbull
Thunderstone Software
KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

When I created a new profile, I set the default/min/max refresh time to 1 year and did a complete walk. After that, I changed the NextCheck date for about 3% of the rows in the html table to 5 years instead of 1 year, since I never want to refresh those pages.

What if I now want to do a weekly refresh of the other 97% of pages but never touch the 3% that have NextCheck set to 5 years? How do I do that?

Post by John »

Set min/max/default to 1 week, and then update NextCheck to +1 week for the 97% of pages whose NextCheck is still less than a year away, which leaves the rows set to 5 years untouched, e.g.:

update html set NextCheck = '+1 week' where NextCheck < '+1 year';
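
For anyone who wants to see which rows that statement touches, here is a rough Python model of it. This is a sketch under stated assumptions, not Texis code: `reschedule`, the dict standing in for the html table, and the example URLs and dates are all hypothetical.

```python
from datetime import datetime, timedelta


def reschedule(next_check, now, new_interval, threshold):
    """Model of: update html set NextCheck = now + new_interval
                 where NextCheck < now + threshold

    Rows at or beyond the cutoff keep their existing NextCheck.
    """
    cutoff = now + threshold
    return {
        url: (now + new_interval if when < cutoff else when)
        for url, when in next_check.items()
    }


walked = datetime(2004, 7, 1)   # when the original full walk ran (assumed)
now = datetime(2004, 9, 1)      # when the update is issued (assumed)

table = {
    # walked with a 1-year default, so due 1 year after the walk
    "http://example.com/news.html": walked + timedelta(days=365),
    # manually pushed out to 5 years
    "http://example.com/archive.html": walked + timedelta(days=5 * 365),
}

table = reschedule(table, now, timedelta(weeks=1), timedelta(days=365))
for url, when in sorted(table.items()):
    print(url, when.date())
```

The 1-year row falls before the cutoff because the walk happened in the past, so it is moved to one week from now; the 5-year row is beyond the cutoff and is left alone.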
John Turnbull
Thunderstone Software

Post by KMandalia »

So min/max/default corresponds directly to NextCheck, but changing it will not alter NextCheck in the database for pages that have already been walked, while new pages will have NextCheck set to 1 week.

Am I right?

Post by John »

Min/max/default are used to calculate NextCheck when a page is visited, either the first time (default) or as a refresh. Changing them does not change any existing NextCheck values. New pages get the default. The refresh will look at pages that have a NextCheck earlier than the current date/time.
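
A small Python model of that behaviour, for illustration only. The `Profile` class and its methods are hypothetical names, not part of Webinator, and the sketch covers just the first-visit default. How min and max enter the calculation on a refresh is not spelled out in this thread, so that part is left out.

```python
from datetime import datetime, timedelta


class Profile:
    """Toy model: a default refresh interval plus a URL -> NextCheck table."""

    def __init__(self, default_interval):
        self.default_interval = default_interval
        self.next_check = {}

    def visit_new(self, url, now):
        # A page seen for the first time gets now + the current default.
        self.next_check[url] = now + self.default_interval

    def due(self, now):
        # The refresh looks at rows whose NextCheck is earlier than now.
        return sorted(u for u, when in self.next_check.items() if when < now)


t0 = datetime(2004, 9, 1)
profile = Profile(default_interval=timedelta(days=365))
profile.visit_new("http://example.com/old.html", t0)

# Changing the setting does not rewrite NextCheck for existing rows.
profile.default_interval = timedelta(weeks=1)
profile.visit_new("http://example.com/new.html", t0)

print(profile.next_check["http://example.com/old.html"].date())
print(profile.due(t0 + timedelta(days=8)))
```

Eight days later only new.html is due, because old.html still carries the NextCheck it was given under the 1-year default.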
John Turnbull
Thunderstone Software