AnytoTx and gw hang

Posted: Mon Sep 06, 2004 4:48 pm
by mikep
I am indexing an internal web site with about 50,000 documents. I recently started having a problem where anytotx.exe and gw.exe both hang, each taking 50% of the CPU. I have waited up to 3 hours for one of them to time out, and it never happens. If I kill anytotx.exe, gw continues walking after a few minutes. It is not always the same files that cause the problem. I will be upgrading to version 5.x soon, but at the moment I have to make this work with version 2.7 (running on Windows 2000).

Does anybody have any ideas?

Posted: Mon Sep 06, 2004 8:48 pm
by mark
Try doing it in smaller chunks. Use the -p option to limit each run to, say, 2000-5000 pages. Also use the -noindex option. Then keep running the command until no pages are fetched. Then run once with -index.

Posted: Mon Sep 06, 2004 9:12 pm
by mikep
How can I tell when no pages are fetched? Does "select count(*) from todo" return zero? Or do I compare the before and after values of "select count(*) from html"?

Posted: Tue Sep 07, 2004 10:27 am
by mark
Either method should work, except that I would use count(Url) rather than count(*).
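
For anyone who wants to script the "keep running until no pages are fetched" loop, here is a rough Python sketch of the before/after comparison. It is only an illustration under assumptions: run_batch and count_fetched are hypothetical callables you would write yourself (for example, one that launches gw with -p and -noindex, and one that runs "select count(Url) from html"), and max_rounds is a made-up safety cap. Nothing here is part of the product.

```python
from typing import Callable


def walk_in_batches(run_batch: Callable[[], None],
                    count_fetched: Callable[[], int],
                    max_rounds: int = 100) -> int:
    """Repeat a limited walk until the fetched-page count stops growing.

    run_batch:     hypothetical callable that performs one limited,
                   no-index walk (how it invokes gw is up to you).
    count_fetched: hypothetical callable that returns the current row
                   count, e.g. the result of "select count(Url) from html".
    Returns the number of rounds run, including the last round that
    fetched nothing.
    """
    before = count_fetched()
    for round_no in range(1, max_rounds + 1):
        run_batch()
        after = count_fetched()
        if after == before:
            # No new pages were stored in this round, so the walk is done.
            return round_no
        before = after
    raise RuntimeError("page count was still growing after max_rounds rounds")
```

Once the function returns, the final single run with -index would follow. The todo-table check could be used instead by stopping as soon as its count reaches zero.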