Reindex or Delete a modified page near-time

Posted: Thu Jul 29, 2004 3:07 pm
by carl
Hi,
How can I reindex or delete a single page from the command line without walking the whole site? I am using Webinator v5. dowalk always seems to rely on the NextCheck field, but my documents can change from one minute to the next, and I'd like the reindexing or deletion to happen near-time rather than waiting 15 minutes.
Thanks!
carl

Posted: Thu Jul 29, 2004 3:49 pm
by mark
To delete a page, see http://www.thunderstone.com/texis/site/ ... e+Database

To keep content fresher, set both the default refresh time and the minimum refresh time to 1 minute, and change the walk type to refresh.

Posted: Thu Jul 29, 2004 4:11 pm
by mark
Also, under list/edit URLs there's an "Update soon" option that tells Webinator to update that URL at the beginning of the next refresh cycle.

Posted: Thu Jul 29, 2004 4:17 pm
by carl
Thanks, Mark. A couple of follow-up questions:

1. I noticed this in my vortex.log file:
115 2004-07-29 13:25:27 /usr/local/morph3/texis/scripts/webinator/dowalk:69: Field NextCheck non-existent
000 2004-07-29 13:25:27 /usr/local/morph3/texis/scripts/webinator/dowalk:69: SQLExecute() failed with -1 in the function execntexis

This is logged when I rewalk. Do you know what could be wrong?

2. Can I update or delete an individual page from the command line without walking the whole site? Something like this:
./texis -r profile="X" top="http://xyz/doc5.html" /usr/local/morph3/texis/scripts/webinator/dowalk/refresh.txt

where doc5.html is the only page I want to update.

Thanks!
carl

Posted: Thu Jul 29, 2004 4:32 pm
by John
1. That might happen if you are trying to refresh a Version 4 database.

2. You can do the refresh in two steps:

./texis -r profile="X" top="http://xyz/doc5.html" /usr/local/morph3/texis/scripts/webinator/dowalk/refreshnow.txt
./texis -r profile="X" /usr/local/morph3/texis/scripts/webinator/dowalk/refresh.txt

which will refresh doc5.html as well as any other pages scheduled to be refreshed.
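
A minimal sketch of how those two commands might be wrapped in a shell function for scripting. Only the two texis command lines come from the post above. The names refresh_url and run, and the TEXIS, DOWALK, and DRY_RUN variables, are made up for this example, and the paths assume the same install location shown in this thread.

```shell
#!/bin/sh
# Hypothetical wrapper around the two-step refresh described above.
# TEXIS and DOWALK default to the locations used in this thread;
# adjust them for your own install.
TEXIS="${TEXIS:-./texis}"
DOWALK="${DOWALK:-/usr/local/morph3/texis/scripts/webinator/dowalk}"

# run: execute a command, or just print it when DRY_RUN is set
# (DRY_RUN is an assumption of this sketch, not a texis feature).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# refresh_url PROFILE URL
# Runs refreshnow.txt for the single URL, then the regular refresh.txt.
refresh_url() {
    profile="${1:-}"
    url="${2:-}"
    if [ -z "$profile" ] || [ -z "$url" ]; then
        echo "usage: refresh_url PROFILE URL" >&2
        return 2
    fi
    # Step 1: the refreshnow.txt call for the one page.
    run "$TEXIS" -r profile="$profile" top="$url" "$DOWALK/refreshnow.txt" || return 1
    # Step 2: the normal refresh, which also handles any other pages that are due.
    run "$TEXIS" -r profile="$profile" "$DOWALK/refresh.txt"
}
```

After sourcing the file, setting DRY_RUN=1 and calling refresh_url X http://xyz/doc5.html prints the two commands instead of running them, which is a cheap way to check the paths before touching the database.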

Posted: Thu Jul 29, 2004 6:11 pm
by carl
John,
Regarding this error:
dowalk:69: Field NextCheck non-existent

That's strange, because I started using Webinator with version 5 and did not have Texis installed previously.

[root@cblake bin]# ./texis -version
Texis Web Script (Vortex) Copyright (c) 1996-2004 Thunderstone - EPI, Inc.
Free Webinator Version 5.00.1086121238 20040601 (i686-unknown-linux2.4.9-64-32)

I did change the dowalk script to insert the $when variable into the Visited column instead of 'now'. Could this be causing the error with the NextCheck field?

Thanks!
carl

Posted: Thu Jul 29, 2004 9:27 pm
by John
Are you using the latest version of the scripts? What is on line 69?

Why change Visited to $when, given that the Modified column already stores $when?

Posted: Thu Jul 29, 2004 11:15 pm
by carl
This is line 69:
<SQL MAX=1 "select Url x from html where NextCheck < $times">
The error is reported on this line, but other places in dowalk where NextCheck is used seem to work just fine.

I changed Visited to $when because my documents have a publication date and I need to sort search results by publication date, descending. So I've set up my web server to return the publication date in the Last-Modified header ($when).

Thanks very much for sorting this out with me.
carl

Posted: Fri Jul 30, 2004 10:12 am
by John
The error on line 69 should not have any serious consequences, and we'll get that resolved in an update.

You should use the Modified field instead of Visited, as it already stores the Last-Modified header value. You can add Modified to the index, and use it in the order by.

Posted: Fri Jul 30, 2004 10:53 am
by mark
Friendly note to everyone, including myself: let's try to keep new threads in the proper groups. This discussion about Webinator 5 should have taken place in the Webinator group rather than the Webinator 2.5 group.