Delete URLs manually when all fails

edev
Posts: 127
Joined: Wed Sep 14, 2005 5:10 pm

Post by edev »

Hi,

I have a profile as a base, and I combined another database with this profile using a script like this:
<DB=$sourceDB>
<SQL ROW "select * from html">
<DB=$targetDB>
<SQL NOVARS "insert into html values ($id, $Hash, $Size, $Visited, $Dlsecs, $Depth, $Url, $Title, $Body, $Keywords, $Description,
$Meta, $Catno, $Modified, $NextCheck, $Views, $Clicks, $CTR, $Pop, $MimeType, $Charset)">
</SQL>
</SQL>
</A>

Then I did a reindex so I would get a combination of the two databases. The problem is that I cannot delete URLs using the Edit/Delete URL option on the admin page. I want to delete all URLs matching a pattern like http://www.somesite.com/PrinterFriendly*; the goal is to get rid of any URL matching *PrinterFriendly*.
Can I do this using a delete statement like
"texis -d E:\profile/db2 - s delete from html where URL like '*PrinterFriendly*'"?
Can this be done in a command window, or do I have to write a script to do this? Thanks!
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

You should be able to use List/Edit URLs to delete. What happened when you tried?

To run from the command line you can use:

texis -d E:\profile\db2 -s "delete from html where URL matches 'http://www.somesite.com/PrinterFriendly%'"

although using the admin interface will also make sure that the other information about the profile is kept valid, such as indexes being built, counts updated, and other tables updated to match.
John Turnbull
Thunderstone Software
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Why can't you use List/Edit URLs to delete?

Simple SQL statements can be executed from the command line or in a script. You could use one of these:
... where Url like 'printerfriendly'
... where Url matches '%PrinterFriendly%'
... where Url matches 'http://www.somesite.com/PrinterFriendly%'
edev
Posts: 127
Joined: Wed Sep 14, 2005 5:10 pm

Post by edev »

Thank you John. When I try to delete the URLs from the admin screen I get this:

006 /webinator/dowalk(logerror) 2428: Can't write to KDBF file E:\CultureCaWalks\CultureCa_Site1b\db1\error.tbl: No file write permission in the function kdbf_free
006 /webinator/dowalk(logerror) 2428: Can't write to KDBF file E:\CultureCaWalks\CultureCa_Site1b\db1\xerrorurl.btr: No file write permission in the function kdbf_put
006 /webinator/dowalk(logerror) 2428: Can't write to KDBF file E:\CultureCaWalks\CultureCa_Site1b\db1\xerrorurl.btr: No file write permission in the function kdbf_put
006 /webinator/dowalk(logerror) 2428: Can't write to KDBF file E:\CultureCaWalks\CultureCa_Site1b\db1\xerrorurl.btr: No file write permission in the function kdbf_put
006 /webinator/dowalk(logerror) 2428: Can't write to KDBF file E:\CultureCaWalks\CultureCa_Site1b\db1\xerrorid.btr: No file write permission in the function kdbf_put
006 /webinator/dowalk(logerror) 2428: Can't write to KDBF file E:\CultureCaWalks\CultureCa_Site1b\db1\xerrorid.btr: No file write permission in the function kdbf_put
006 /webinator/dowalk(logerror) 2428: Can't write to KDBF file E:\CultureCaWalks\CultureCa_Site1b\db1\xerrorid.btr: No file write permission in the function kdbf_put
006 /webinator/dowalk(logerror) 2428: Cannot delete value (http://www.thecanadianencyclopedia.com/ ... 1ARTFET_E8) from index E:\CultureCaWalks\CultureCa_Site1b\db1\xerrorurl.btr
006 /webinator/dowalk(logerror) 2428: Cannot delete value (45802f29630) from index E:\CultureCaWalks\CultureCa_Site1b\db1\xerrorid.btr
006 /webinator/dowalk(logerror) 2428: Can't write to KDBF file E:\CultureCaWalks\CultureCa_Site1b\db1\error.tbl: No file write permission in the function kdbf_free
006 /webinator/dowalk(logerror) 2428: Cannot delete value (http://www.thecanadianencyclopedia.com/ ... 1ARTFET_E8) from index E:\CultureCaWalks\CultureCa_Site1b\db1\xerrorurl.btr
006 /webinator/dowalk(logerror) 2428: Cannot delete value (45802f29630) from index E:\CultureCaWalks\CultureCa_Site1b\db1\xerrorid.btr
006 /webinator/dowalk(logerror) 2428: Can't write to KDBF file E:\CultureCaWalks\CultureCa_Site1b\db1\error.tbl: No file write permission in the function kdbf_free
006 /webinator/dowalk(logerror) 2428: Cannot delete value (http://www.thecanadianencyclopedia.com/ ... 1ARTFET_E8) from index E:\CultureCaWalks\CultureCa_Site1b\db1\xerrorurl.btr
006 /webinator/dowalk(logerror) 2428: Cannot delete value (45802f29630) from index E:\CultureCaWalks\CultureCa_Site1b\db1\xerrorid.btr
006 /webinator/dowalk(logerror) 2428: Can't write to KDBF file E:\CultureCaWalks\CultureCa_Site1b\db1\error.tbl: No file write permission in the function kdbf_free
006 /webinator/dowalk(logerror) 2428: Cannot delete value (http://www.thecanadianencyclopedia.com/ ... 1ARTFET_E8) from index E:\CultureCaWalks\CultureCa_Site1b\db1\xerrorurl.btr
edev
Posts: 127
Joined: Wed Sep 14, 2005 5:10 pm

Post by edev »

I tried to delete from a command line:
C:\Program Files\Thunderstone Software\Webinator>texis -d E:\CultureCaWalks\CultureCa_Site1b\db1 -s "delete from html where URL matches 'http://www.thecanadianencyclopedia.com/PrinterFriendly%'"
Texis Web Script (Vortex) Copyright (c) 1996-2004 Thunderstone - EPI, Inc.
Commercial Webinator Version 5.00.1085675754 20040527 (i686-intel-winnt-32-32)

115 Field URL non-existent
000 SQLPrepare() failed with -1 in the function prepntexis

...and got the above error. Help?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Your Webinator installation is seriously hosed and you're lucky if anything works. If you don't have perms to delete you certainly don't have them for crawling either.

John followed your example and called the field URL. It's called Url. Case matters.
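
For anyone scripting this, here is a minimal Python sketch of building the corrected command as an argument list, which sidesteps the nested quoting a command window needs. The helper names `texis_sql_command` and `preview_and_delete_sql` are hypothetical, not part of Texis; only the `-d` and `-s` options, the `html` table, and the `Url` field come from this thread. It pairs the delete with a select on the same condition so the matching rows can be reviewed first.

```python
def texis_sql_command(db_dir, sql):
    """Return the argument list for running one SQL statement with texis."""
    # -d names the database directory and -s passes the SQL statement,
    # as in the command lines quoted in this thread.
    return ["texis", "-d", db_dir, "-s", sql]


def preview_and_delete_sql(table, field, pattern):
    """Build a SELECT and a DELETE that share one WHERE clause.

    Hypothetical helper: the field name is used exactly as given, because
    Texis field names are case-sensitive (Url, not URL). The pattern is
    inserted as-is, so it must not contain a single quote.
    """
    where = f"where {field} matches '{pattern}'"
    return (f"select {field} from {table} {where}",
            f"delete from {table} {where}")


if __name__ == "__main__":
    db = r"E:\CultureCaWalks\CultureCa_Site1b\db1"
    preview, delete = preview_and_delete_sql(
        "html", "Url",
        "http://www.thecanadianencyclopedia.com/PrinterFriendly%")
    # Print both commands; nothing is executed here.
    print(texis_sql_command(db, preview))
    print(texis_sql_command(db, delete))
```

Either list can be handed to `subprocess.run` once the preview output looks right. The sketch only builds the commands, so it does nothing about the permission problem.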
edev
Posts: 127
Joined: Wed Sep 14, 2005 5:10 pm

Post by edev »

Hi Mark,

I know the installation was somewhat non-standard. I didn't install it, and we have already started setting up a new server in the hope that it will solve the problem. In the meantime I'll have to work with this server. From what you have seen, could you recommend any changes to the installation? Thanks.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Try making the database directory and all files in it full control for everyone. Make sure it's always accessed as the same user, keeping in mind that the webserver's CGI user is always accessing it for searches and web admin. Most likely anything you ran from the command line would have run as a different user and would either not work or cause problems for the web interface down the line.
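
To see which files are affected, a rough Python sketch like the one below can list the files in the database directory that the current account cannot open for writing. The function name `unwritable_files` is hypothetical. It only tells you about the account that runs it, so it would have to be run as the webserver's CGI user to reproduce what the web admin sees, and a file held open exclusively by another process may also show up in the list.

```python
import os


def unwritable_files(db_dir):
    """Return the files under db_dir that cannot be opened for writing."""
    blocked = []
    for root, _dirs, files in os.walk(db_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                # Append mode asks for write access without truncating
                # the file; nothing is written before it is closed.
                with open(path, "ab"):
                    pass
            except OSError:
                blocked.append(path)
    return sorted(blocked)


if __name__ == "__main__":
    # Path taken from the error messages earlier in the thread.
    for path in unwritable_files(r"E:\CultureCaWalks\CultureCa_Site1b\db1"):
        print(path)
```

An empty result for one account says nothing about another account, which is the point of always accessing the database as the same user.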
edev
Posts: 127
Joined: Wed Sep 14, 2005 5:10 pm

Post by edev »

I have an internet user set up for the web interface of Webinator, and I am the local admin of the server (Windows 2003 Server), so anything I run from the command line runs under the local admin rights.

I am now running this query from the command line:

texis -d E:\CultureCaWalks\CultureCa_Site1b\db1 -s "delete from html where Url matches 'http://www.thecanadianencyclopedia.com/PrinterFriendly%'"

and now the command seems to freeze after this:
Texis Web Script (Vortex) Copyright (c) 1996-2004 Thunderstone - EPI, Inc.
Commercial Webinator Version 5.00.1085675754 20040527 (i686-intel-winnt-32-32)

id Hash Size Visited Dlsecs Depth Url Title Body Keywords Description Meta Catno Modified NextCheck Views Clicks CTR Pop MimeType Charset
------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+

It's been 30 minutes and I'm still waiting for the results to come up. There are over 33,000 URLs that match this pattern, and the database holds about 420,000 URLs. Is this command just taking too long?
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

Basically grant full control to Everyone and the IUSR_ user on the database directory and files.
John Turnbull
Thunderstone Software