max page size exceeded

Thunderstone
Site Admin
Posts: 2504
Joined: Wed Jun 07, 2000 6:20 pm

Post by Thunderstone »



Hi,
I am indexing my site and it stopped once at 522 pages, and at 420 pages
the 2nd time. I have 1500 pages of HTML on my site. I ran the
/array/www/tpl/htdocs/webinator/bin/gw http://www.tpl.toronto.on.ca command.
Now it won't index my site at all. It now says:

vrl 1# /array/www/tpl/htdocs/webinator/bin/gw http://www.tpl.toronto.on.ca
No database specified. Use the default (/array/www/tpl/htdocs/webinator/db)?
(y/n) default is y : y
You may use "-d-" to skip this question in the future.
Getting http://199.71.64.103/robots.txt...Got it...Ok.
Adding todo: http://www.tpl.toronto.on.ca/
http://www.tpl.toronto.on.ca/ is already in the database
Visited 0 pages total

Why isn't the indexer indexing?

Thank you in advance.

Will
TPL





Post by Thunderstone »



Webinator will not index pages it already has.
If you want to start over use the -wipe option to clear the database.
If you have a good walk and want to refresh it, use -rewalk.
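
To illustrate why the second run reports "Visited 0 pages", here is a minimal toy model in Python. It is not Webinator's code: the `walk` function, the `links` map, and the `wipe` parameter are hypothetical names invented for this sketch, and the only behavior modeled is "skip anything already in the database".

```python
from collections import deque

def walk(links, seed, db, wipe=False):
    """Toy crawler. `links` maps a URL to the URLs it links to;
    `db` is the set of URLs already stored (hypothetical stand-in
    for the walk database)."""
    if wipe:
        db.clear()  # start over, similar in spirit to clearing the database
    todo = deque([seed])
    visited = 0
    while todo:
        url = todo.popleft()
        if url in db:
            continue  # already stored: skipped, so its links are never followed
        db.add(url)
        visited += 1
        todo.extend(links.get(url, []))
    return visited

links = {"/": ["/a", "/b"], "/a": ["/c"], "/b": [], "/c": []}
db = set()
first = walk(links, "/", db)              # fresh database: every page is visited
second = walk(links, "/", db)             # seed already stored: nothing is visited
third = walk(links, "/", db, wipe=True)   # cleared database: walk starts over
print(first, second, third)
```

In this sketch the second run stops immediately because the start page is already stored, which matches the "is already in the database" message above. Whether the real gw behaves the same way internally is an assumption.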

Some reasons gw might not get all of your pages:
- They are not linked into the site.
- Your site's robots.txt settings exclude them or the pages that refer to them.
- -j or -x options exclude them or the pages that refer to them.
- The truncated pages you were getting (max page size exceeded) had more
  links beyond the truncation point, so those links were never discovered.
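
The truncation case can be sketched with a small simulation. This is only an illustration under stated assumptions: the `crawl` function, the `max_page_size` parameter, and the in-memory `pages` dictionary are made up for the example and do not reflect how gw is implemented or configured.

```python
import re
from collections import deque

def crawl(pages, seed, max_page_size):
    """Toy crawler that reads only the first `max_page_size` characters
    of each page before extracting links (hypothetical model)."""
    seen = {seed}
    todo = deque([seed])
    while todo:
        url = todo.popleft()
        body = pages.get(url, "")[:max_page_size]  # anything past the limit is dropped
        for link in re.findall(r'href="([^"]+)"', body):
            if link not in seen:
                seen.add(link)
                todo.append(link)
    return seen

# The home page links to /a near the top and to /b after a long filler run.
pages = {
    "/": '<a href="/a">a</a>' + "x" * 100 + '<a href="/b">b</a>',
    "/a": "",
    "/b": '<a href="/c">c</a>',
    "/c": "",
}
small = crawl(pages, "/", 50)     # the link to /b lies past the cutoff
large = crawl(pages, "/", 1000)   # the whole page is read
print(sorted(small), sorted(large))
```

With the small limit, /b is never found, and therefore /c is never found either, even though /c itself is a short page. One truncated page can hide a whole branch of the site.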



