Reindex or Delete a modified page near-time

carl
Posts: 5
Joined: Wed Sep 13, 2000 12:03 pm

Post by carl »

Hi,
How can I reindex or delete a single page from the command line without walking the whole site? I am using Webinator v5. dowalk always seems to rely on the NextCheck field; however, my documents can change from one minute to the next, and I'd like the reindexing or deletion to happen in near real time rather than waiting 15 minutes.
thanks!
carl
mark
Site Admin
Posts: 5513
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Also, under list/edit URLs there's an "Update soon" option that tells Webinator to update that URL at the beginning of the next refresh cycle.
carl
Posts: 5
Joined: Wed Sep 13, 2000 12:03 pm

Post by carl »

Thanks, Mark. A couple of follow-up questions:

1. I noticed this in my vortex.log file:
115 2004-07-29 13:25:27 /usr/local/morph3/texis/scripts/webinator/dowalk:69: Field NextCheck non-existent
000 2004-07-29 13:25:27 /usr/local/morph3/texis/scripts/webinator/dowalk:69: SQLExecute() failed with -1 in the function execntexis

This is generated when I rewalk. Do you know what could be wrong?

2. Can I update or delete an individual page from the command line without walking the whole site? Something like this:
./texis -r profile="X" top="http://xyz/doc5.html" /usr/local/morph3/texis/scripts/webinator/dowalk/refresh.txt

where doc5.html is the only page I want to update.

thanks!
carl
John
Site Admin
Posts: 2597
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

1. That might happen if you are trying to refresh a Version 4 database.

2. You can do the refresh in two steps:

./texis -r profile="X" top="http://xyz/doc5.html" /usr/local/morph3/texis/scripts/webinator/dowalk/refreshnow.txt
./texis -r profile="X" /usr/local/morph3/texis/scripts/webinator/dowalk/refresh.txt

which will refresh doc5.html as well as any other pages scheduled to be refreshed.
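
A minimal sketch of wrapping those two steps in a shell function, in case you want to call it from whatever notices a changed document. The function name `refresh_url` and the `TEXIS` and `SCRIPT_DIR` variables are illustrative assumptions, not part of Webinator; the paths and arguments are copied from the commands above.

```shell
#!/bin/sh
# Hypothetical wrapper; refresh_url, TEXIS and SCRIPT_DIR are illustrative names.
# Defaults mirror the paths used in the commands above.
TEXIS="${TEXIS:-./texis}"
SCRIPT_DIR="${SCRIPT_DIR:-/usr/local/morph3/texis/scripts/webinator}"

refresh_url() {
    profile="$1"
    url="$2"
    if [ -z "$profile" ] || [ -z "$url" ]; then
        echo "usage: refresh_url PROFILE URL" >&2
        return 2
    fi
    # Step 1: the refreshnow call for the single URL (first command above).
    "$TEXIS" -r profile="$profile" top="$url" \
        "$SCRIPT_DIR/dowalk/refreshnow.txt" || return 1
    # Step 2: the regular refresh (second command above), which also
    # refreshes any other pages already scheduled.
    "$TEXIS" -r profile="$profile" "$SCRIPT_DIR/dowalk/refresh.txt"
}
```

With the defaults, `refresh_url X http://xyz/doc5.html` run from the directory containing texis would issue the same two commands as above. The function stops after step 1 if that command exits non-zero.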
John Turnbull
Thunderstone Software
carl
Posts: 5
Joined: Wed Sep 13, 2000 12:03 pm

Post by carl »

John,
Regarding this error:
dowalk:69: Field NextCheck non-existent

It's strange, because I started with Webinator at version 5 and did not have Texis previously.

[root@cblake bin]# ./texis -version
Texis Web Script (Vortex) Copyright (c) 1996-2004 Thunderstone - EPI, Inc.
Free Webinator Version 5.00.1086121238 20040601 (i686-unknown-linux2.4.9-64-32)

I did change the dowalk script to insert the $when variable into the Visited column instead of 'now'. Could this be causing the error with the NextCheck field?

thanks!
carl
John
Site Admin
Posts: 2597
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

Are you using the latest version of the scripts? What is on line 69?

Why change Visited to $when, when the Modified column already stores $when?
John Turnbull
Thunderstone Software
carl
Posts: 5
Joined: Wed Sep 13, 2000 12:03 pm

Post by carl »

This is line 69:
<SQL MAX=1 "select Url x from html where NextCheck < $times">
It seems to give an error on this line, but other places in dowalk where NextCheck is used seem to work just fine.

I changed Visited to $when because my documents have a published date and I need to sort search results by publication date, descending. So I've set up my web server to return the published date in the Last-Modified header ($when).

Thanks very much for sorting this out with me.
carl
John
Site Admin
Posts: 2597
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

The error on line 69 should not have any serious consequences, and we'll get that resolved in an update.

You should use the Modified field instead of Visited, as it already stores the Last-Modified header value. You can add Modified to the index, and use it in the order by.
John Turnbull
Thunderstone Software
mark
Site Admin
Posts: 5513
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Friendly note to everyone, including myself: let's try to keep new threads in the proper groups. This discussion about Webinator 5 should have taken place in the Webinator group rather than the Webinator 2.5 group.