wipe to-do table

KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

I need to wipe out the todo table, because on a refresh walk Webinator resumes a thread that I no longer wish to refresh.

Other than todo.tbl, what other files need to be considered?

Can I just execute one statement from the command line, like
<sql novars "delete * from todo"></sql>
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

That would be
"delete from todo"
which will effectively erase all resume data. You may want to be more specific if you're walking more than one site:
"delete from todo where Url matches 'http://thesite%'"

But in general "refresh" will refresh everything in the database. You can't tell it to refresh one site and not another.
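
As a rough illustration of the difference between the two delete statements above, here is a minimal Python sketch that uses SQLite as a stand-in for Texis. The one-column table layout and the sample URLs are assumptions made for the example, and SQLite's `LIKE` is used in place of the Texis `matches` operator, so treat it as a model of the behavior rather than something to run against a real Webinator database.

```python
import sqlite3

# Stand-in for the Webinator todo table; the one-column schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE todo (Url TEXT)")
conn.executemany(
    "INSERT INTO todo (Url) VALUES (?)",
    [
        ("http://thesite/a.html",),
        ("http://thesite/b.html",),
        ("http://othersite/c.html",),
    ],
)

def remaining(conn):
    """Return the URLs still queued in todo, sorted for stable comparison."""
    return sorted(row[0] for row in conn.execute("SELECT Url FROM todo"))

# Scoped delete: only the resume data for one site is removed.
# SQLite LIKE stands in for the Texis "matches" operator here.
conn.execute("DELETE FROM todo WHERE Url LIKE ?", ("http://thesite%",))
after_scoped = remaining(conn)

# Unscoped delete: all resume data is removed, for every site.
conn.execute("DELETE FROM todo")
after_full = remaining(conn)

print(after_scoped)
print(after_full)
```

The scoped delete leaves the other site's resume entries in place, while the unscoped one empties the table.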

Post by KMandalia »

I am not quite sure how the intelligent refresh decides which data to refresh (since it no longer refreshes all pages in the database, only those it considers may have changed). But even though I have 3 servers, it resumes on just one and doesn't seem to get out of it.

I have blocked the whole site that it is refreshing right now. It is still bringing in more pages that match what is already in the database.

Post by KMandalia »

What if I put a base URL in the exclude list, like
http://www.somesite.com/*

Eventually all entries in the html and refs tables should be deleted, right?

In other words, how does Webinator treat web pages it can't update: do they sit in the tables, or do they get wiped out eventually?

Post by mark »

Changing the rules won't cause a page to be deleted from the database. Pages that it tries to refresh that no longer exist on the server will be deleted from the database. To remove pages from the database that still exist on the server, you need to explicitly delete them.

Post by KMandalia »

OK. So 404 errors will delete the corresponding web pages, but errors such as 'request denied' or 'server unexpectedly closed connection' will not delete the pages from the table. Am I right?

Post by mark »

All HTTP responses of 400 or greater will cause the page to be deleted, as will most server errors that result in an incomplete page.
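
The rule above can be sketched in a few lines of Python. This is only a model of the behavior described in this post, not Webinator's actual code; the function names, the `incomplete` flag, and the sample URLs are made up for the illustration.

```python
def should_delete(status, incomplete=False):
    """Model of the refresh rule: HTTP status 400 or greater deletes the page.

    `incomplete` is a hypothetical flag for a fetch that failed part-way.
    The post says most (not all) such errors also cause deletion, so
    treating every one as a delete is a simplification.
    """
    return incomplete or status >= 400

def refresh(stored, results):
    """Return the stored URLs that survive one refresh pass.

    `results` maps each URL to a (status, incomplete) pair.
    URLs with no fetch result are kept untouched.
    """
    kept = []
    for url in stored:
        status, incomplete = results.get(url, (200, False))
        if not should_delete(status, incomplete):
            kept.append(url)
    return kept

stored = ["http://a/1", "http://a/2", "http://a/3", "http://a/4"]
results = {
    "http://a/1": (200, False),  # fetched fine: kept
    "http://a/2": (404, False),  # not found: deleted
    "http://a/3": (403, False),  # request denied is also >= 400: deleted
    "http://a/4": (200, True),   # connection dropped mid-page: deleted here
}
survivors = refresh(stored, results)
print(survivors)
```

Under this model a 'request denied' response is treated the same as a 404, since both are 400 or greater.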

Post by KMandalia »

Is deleting the Visited field for the URL patterns I never wish to update an effective way of altering the todo table?

Today I wiped out the todo table, but some of the URLs I don't wish to refresh are listed in the next URLs to walk and are being crawled right now.

Also, doesn't wiping out the todo table ensure that refresh will start with the base URL list and try to update everything in turn?

Post by KMandalia »

Got it.

Thanks, John.