can't delete from the html table

mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Yes.
KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

Here is what I have done for the recent deletion of about 42,000 pages.

First I made a copy of the live db for the profile we are using and named the copy 'mydb'. I then deleted the syslocks.seq file from 'mydb'. The live db is untouched thus far.
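For anyone repeating this step, here is a minimal Python sketch of the copy, assuming the database is an ordinary directory of files. The function names are hypothetical, and skipping every entry whose name starts with "syslocks" during the copy is my stand-in for deleting the file afterwards. It does not address whether a copy taken while the live db is in use is consistent.

```python
import os
import shutil


def skip_lock_files(directory, names):
    """Tell copytree which entries to skip: any name starting with 'syslocks'.

    The comparison is case-insensitive so it covers both SYSLOCKS and
    syslocks.seq. Treating these as the only lock files is an assumption.
    """
    return [name for name in names if name.lower().startswith("syslocks")]


def copy_database(live_dir, work_dir):
    """Copy live_dir to work_dir, leaving the lock files behind.

    copytree raises FileExistsError if work_dir already exists, so an
    earlier working copy is never silently overwritten.
    """
    shutil.copytree(live_dir, work_dir, ignore=skip_lock_files)
```

Calling `copy_database(live_path, mydb_path)` with your own two paths leaves the live directory untouched.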

Q: After I ran the deletion in mydb, the size of the xrefsurl, html, and refs tables increased by more than 100 MB. What could be the reason?

Then I ran this script from the command line:

<script language=vortex>
<timeout=-1></timeout>
<a name=main public>
<$mydb="E:\Program Files\Thunderstone Software\Webinator\mydb">
<$deletethis="http://www.forrester.com/find?SortType%">
<sql db=$mydb novars "delete from html where Url matches $deletethis"></sql>
<sql db=$mydb novars "delete from refs where Url matches $deletethis"></sql>
</a>
</script>

Now, am I right about the following sequence? (I have the latest 5.1.3 dowalk script.)

0. I will take a backup of the current live db and store the entire directory somewhere.

1. I will replace the live db with 'mydb' without replacing the syslocks.seq file.

2. I will then run reindex (or should it be remakeindex?) from the command line for the current profile.

I messed it up last time, so I want to be very cautious this time.

Let me know and then I will perform the steps I outlined.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

0. Rather than making more copies, just rename the live db directory, then rename mydb to the live db's name. That makes the switch nearly atomic and avoids weird issues if someone tries a search while only half of the db has been copied.
1. The file is SYSLOCKS not SYSLOCKS.SEQ. It may or may not exist depending on platform and circumstances.
2. Use the new "updateindex" instead of "reindex", assuming there were no errors while running your script and you kept more pages than you deleted.
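
The rename-based switch in step 0 could be sketched in Python as below. This is only an illustration under assumptions: the function name and the three path arguments are hypothetical, all three paths are expected to be on the same volume, and on Windows a rename can fail if any process still has files open in the directory.

```python
import os


def swap_database(live_dir, new_dir, backup_dir):
    """Put new_dir in place of live_dir using two directory renames.

    The old live directory is kept under backup_dir, so it doubles as
    the backup. All names are placeholders for your own paths.
    """
    if os.path.exists(backup_dir):
        raise FileExistsError("backup path already exists: " + backup_dir)
    if not os.path.isdir(new_dir):
        raise FileNotFoundError("replacement database not found: " + new_dir)

    # First rename: move the live db out of the way.
    os.rename(live_dir, backup_dir)
    try:
        # Second rename: the prepared copy takes over the live name.
        os.rename(new_dir, live_dir)
    except OSError:
        # Put the original back if the second rename fails.
        os.rename(backup_dir, live_dir)
        raise
```

Between the two renames there is a brief moment with no directory under the live name, which is why the switch is "nearly" rather than fully atomic.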
KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

I deleted syslocks.seq and it got recreated. Do you think my deleting it could have any adverse impact? I don't have a SYSLOCKS file (I am on Windows Server 2003).

Say I have the mydb directory somewhere in INSTALLDIR, can I run

texis PATHTOMYDB ttyverbose=4 dowalk/updateindex.txt from the command line or do I have to have a profilename specified?

Any idea why the size of refs and html tables increased by more than 100 MB?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Deleting SYSLOCKS.SEQ shouldn't cause you any problem.

To manage the database with dowalk you have to use a profile, and the database has to be in the expected place with the expected name. You can work with any database using scripts you write yourself. If you wanted to futz a little, you could create a copy of your profile settings, put your mydb in place as that profile's database, and use that profile to update the index. Then the database will be all ready to be made live for the live profile.

There's an index of deleted/free space within the table itself. That grows when you delete a lot of data without inserting any.