Watch URL and walk schedule

dietric
Posts: 100
Joined: Fri May 20, 2005 10:57 am

Post by dietric »

Is it possible to index pages listed in the Watch URL on a different schedule than the normal refresh schedule?
I have too many profiles to be able to do a complete walk every day, but would like to ensure that updated content gets picked up quickly.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

There's not really any way to separate a group of URLs from the normal refresh rules. If the servers provide Last-Modified information, the refresh should home in on the changing data within a few cycles.

Or maybe you just need an additional appliance?

Post by dietric »

Well, I already have three appliances...;-) If I use "On Change" as the schedule and define a Watch URL, will a complete walk be performed or will just the links in the Watch URL be followed?

Thanks
-ds

Post by mark »

A change in the Watch URL just triggers a normal full refresh walk.

Post by dietric »

Does that mean that the URLs specified in the "Watch URL" are not automatically indexed unless I also add the Watch URL to the Base URLs?

Post by mark »

If the Watch URL is already reached as a normal part of your walk, you don't have to also include it in the Base URLs. If it isn't and you want it in your database, you need to add it.

Post by dietric »

I see. One last question: what criteria are used to determine whether the "Watch URL" has been updated? The content of the page, the last-modified META tag, or both?

thanks
-ds

Post by mark »

It looks for Last-Modified and uses that if present. If it's not present, it compares the text of the page (only the textual portions, not the full HTML).
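
For anyone who wants to reason about that rule, here is a rough Python sketch of the decision as described: prefer Last-Modified when it is available, otherwise fall back to comparing only the text of the page. This is not the appliance's code. The names `Snapshot`, `visible_text` and `watch_url_changed` are made up for illustration, and skipping script/style bodies and collapsing whitespace are my assumptions about what "textual portions" means.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import Optional


class _TextExtractor(HTMLParser):
    """Collects text nodes only, skipping script and style bodies (assumption)."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def visible_text(html: str) -> str:
    """Return the page text with markup removed and whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


@dataclass
class Snapshot:
    """Hypothetical record of one fetch of the watched page."""
    html: str
    last_modified: Optional[str] = None  # raw Last-Modified value, if any


def watch_url_changed(old: Snapshot, new: Snapshot) -> bool:
    # Use Last-Modified when the latest fetch supplied one.
    if new.last_modified is not None:
        return new.last_modified != old.last_modified
    # Otherwise compare only the textual portions, not the full HTML.
    return visible_text(old.html) != visible_text(new.html)
```

Under these assumptions, a markup-only change (say, a different CSS class on the same text) would not count as an update, while an unchanged Last-Modified value would suppress the walk even if the text differs.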

Post by dietric »


Post by mark »

There's tons of whitespace at the start of every line. Try creating the file with no extraneous whitespace at the beginning or end of any line, and no blank lines.
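
If the file is generated by a script, a small cleanup pass like the following might do it. This is only a sketch: the file in question isn't shown here, so it assumes a plain-text file with one entry per line, and `clean_lines` is a made-up helper name.

```python
def clean_lines(text: str) -> str:
    """Strip leading/trailing whitespace from each line and drop blank lines."""
    kept = [line.strip() for line in text.splitlines() if line.strip()]
    # Re-emit each surviving line with a single trailing newline.
    return "".join(line + "\n" for line in kept)


if __name__ == "__main__":
    # Hypothetical sample input with indented lines and blank lines.
    messy = "   http://example.com/a  \n\n\t http://example.com/b\n   \n"
    print(clean_lines(messy), end="")
```

To clean an existing file, read it, pass the contents through `clean_lines`, and write the result back out before handing it to the appliance.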