AnytoTx and gw hang

mikep
Posts: 5
Joined: Mon Sep 06, 2004 3:39 pm

Post by mikep »

I am indexing an internal web site with about 50,000 documents. I recently started having a problem where anytotx.exe and gw.exe hang, each taking 50% of the CPU. I have waited up to 3 hours for one of them to time out, and it never happens. If I kill anytotx.exe, gw continues walking after a few minutes. It is not always the same files that cause the problem. I will be upgrading to version 5.x soon, but for now I have to make this work with version 2.7 (running on Windows 2000).

Does anybody have any ideas?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Try doing it in smaller chunks. Use the -p option to limit each run to maybe 2000-5000 pages, and also use the -noindex option. Keep running the command until no pages are fetched, then run it once with -index.
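
The chunked procedure above could be scripted roughly as follows. This is only a sketch under stated assumptions: the names walk_in_chunks, run_chunk, count_html and run_final_index are hypothetical, and the three callables are placeholders you would have to wire up yourself (one limited gw run with -p and -noindex, a row count on the html table, and the final gw run with -index). The exact gw command-line syntax is deliberately not shown here.

```python
from typing import Callable


def walk_in_chunks(run_chunk: Callable[[], None],
                   count_html: Callable[[], int],
                   run_final_index: Callable[[], None],
                   max_rounds: int = 100) -> int:
    """Repeat limited walks until one adds no pages, then index once.

    run_chunk       -- hypothetical: performs one limited, non-indexing walk
    count_html      -- hypothetical: returns the current page count
    run_final_index -- hypothetical: performs the single indexing run
    Returns the number of limited walks that were run.
    """
    for round_no in range(1, max_rounds + 1):
        before = count_html()
        run_chunk()
        if count_html() == before:
            # Nothing new was fetched in this round, so the walk is done.
            run_final_index()
            return round_no
    # Safety cap so a walk that never settles does not loop forever.
    raise RuntimeError("still fetching pages after max_rounds walks")
```

The max_rounds cap is an addition of this sketch, not part of the advice above; it is there so that an unattended script stops if the page count keeps changing.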
mikep
Posts: 5
Joined: Mon Sep 06, 2004 3:39 pm

Post by mikep »

How can I tell when no pages are fetched? Does "select count(*) from todo" return zero? Or do I compare the before and after values of "select count(*) from html"?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Either method should work, except that I would use count(Url) rather than count(*).
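
For illustration, the two checks might look like this in a script. This is a sketch only: it uses Python's sqlite3 module purely as a stand-in for the real database, the helper names are hypothetical, and only the table names (todo, html) and the Url column come from this thread. Against the actual database you would run the same select statements with whatever SQL tool you normally use.

```python
import sqlite3


def count_urls(conn: sqlite3.Connection, table: str) -> int:
    """Return count(Url) for one of the two tables discussed above."""
    if table not in ("todo", "html"):
        raise ValueError("unexpected table name: " + table)
    return conn.execute("select count(Url) from " + table).fetchone()[0]


def todo_is_empty(conn: sqlite3.Connection) -> bool:
    """Method 1: nothing is left in the todo table."""
    return count_urls(conn, "todo") == 0


def html_unchanged(conn: sqlite3.Connection, html_before: int) -> bool:
    """Method 2: the html count is the same as before the last run."""
    return count_urls(conn, "html") == html_before
```

For method 2, take the html count before each run and pass it in afterwards; method 1 needs no saved value.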