Indexing only 1 page

ballen0
Posts: 3
Joined: Thu Jan 04, 2007 12:51 pm

Post by ballen0 »

Hi! I made a new profile (copy of the "default") and it's only indexing the homepage. Other profiles are (seemingly) identical and work fine. I have no idea what the problem could be.

Any suggestions?
jason112
Site Admin
Posts: 347
Joined: Tue Oct 26, 2004 5:35 pm

Post by jason112 »

Are there any children listed for that homepage in list/edit URLs?

Is the site publicly accessible? It's possible that links could be interpreted incorrectly, or there could be a "robots.txt" that tells Webinator not to crawl it (which it obeys by default).
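
To check independently whether a given robots.txt would block a crawler, here is a minimal sketch using Python's standard urllib.robotparser. It is not Webinator's own logic; the helper name allowed_by_robots and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the robots.txt text permits `agent` to fetch `url`.

    Hypothetical helper: it only applies the standard robots.txt rules
    and says nothing about how any particular crawler is configured.
    """
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    block_all = "User-agent: *\nDisallow: /"
    print(allowed_by_robots(block_all, "http://example.com/page.html"))
```

A robots.txt containing "Disallow: /" for all agents blocks every path on the site, which would leave a crawler that obeys it with only the start page.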
ballen0
Posts: 3
Joined: Thu Jan 04, 2007 12:51 pm

Post by ballen0 »

>>> Are there any children listed for that homepage in list/edit URLs?

Yes, there's a whole bunch, but they are all unlinked (not in the database).

---------------------------------

>>> Is the site publicly accessible?

No, there is an .htaccess and .htpasswd file, but I have filled in the auth info in the "Login" fields.

---------------------------------

>>> It's possible that links could be interpreted incorrectly, or there could be a "robots.txt" that tells webinator not to crawl it (which it obeys by default)?

Yes, there's a robots.txt file that says "follow none" but we have the application set not to listen to the robots.txt file.


<STUMPED>
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Turn verbosity up to 4 and run a walk in "new" mode. Then go to list/edit URLs again and click children. Look to the right of each unlinked child URL to see why it was rejected.
ballen0
Posts: 3
Joined: Thu Jan 04, 2007 12:51 pm

Post by ballen0 »

Done. It gives no reasons. :(
John
Site Admin
Posts: 2623
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

Verbosity 4 should show the reasons. Typical reasons for not following a link: its extension is not in the extension list, it matches an exclude or robots.txt, query stripping, or it is off-site or in a different directory than the homepage. Does walk status show any reason for stopping?
John Turnbull
Thunderstone Software
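
For readers who want to test a few child URLs by hand against the reasons listed above, here is a rough sketch in Python. It is not Webinator's implementation: the function name rejection_reason, its parameters, and the order of the checks are assumptions made for illustration, and robots.txt and query stripping are left out.

```python
import posixpath
from urllib.parse import urlparse


def rejection_reason(url, home, extensions, excludes):
    """Return why a crawler might skip `url`, or None if no rule applies.

    Hypothetical helper. `extensions` is a set such as {".html"} and
    `excludes` is a list of substrings; real crawlers may match differently.
    """
    link, base = urlparse(url), urlparse(home)

    # Off-site: the link points at a different host than the homepage.
    if link.netloc.lower() != base.netloc.lower():
        return "off-site"

    # Different directory: the link is not under the homepage's directory.
    base_dir = base.path.rsplit("/", 1)[0] + "/"
    path = link.path or "/"
    if not path.startswith(base_dir):
        return "different directory"

    # Extension list: paths without an extension are treated as allowed.
    ext = posixpath.splitext(path)[1].lower()
    if ext and ext not in extensions:
        return "extension not in list"

    # Excludes: a plain substring match against the whole URL.
    if any(pattern in url for pattern in excludes):
        return "matches exclude"

    return None


if __name__ == "__main__":
    home = "http://example.com/docs/index.html"
    for candidate in ("http://example.com/docs/a.html",
                      "http://example.com/other/a.html"):
        print(candidate, "->", rejection_reason(candidate, home, {".html"}, []))
```

If none of these rules reject the child URLs, the cause is more likely the walk itself stopping early than a filtering rule.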
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

No reason probably indicates that the walk stopped prematurely. Possible causes: the user hit stop or pause, the license limit was reached, or the selected max pages, bytes, or depth limit was reached. Check the walk status as John suggested.