Webinator not processing pages with invalid urls

jintov
Posts: 29
Joined: Fri Feb 22, 2002 5:43 am

Post by jintov »

Hi,
We have Webinator installed on an Apache server to walk the site www.hindustantimes.com. The walk is scheduled with the profile Archives at 3 a.m. every day. We found that some URLs are not processed by Webinator; for example, the page http://www.hindustantimes.com/2003/Sep/ ... 310001.htm is not processed. While debugging, we found that this page contains some invalid URLs, and because of them the page itself is not processed. How can we get Webinator to process the page even though it contains some invalid URLs?

thanks
sandeep
jintov
Posts: 29
Joined: Fri Feb 22, 2002 5:43 am

Post by jintov »

Invalid URLs can be URLs for which the page cannot be fetched. The URL I gave above as an example can itself be fetched. I will go through the debugging process I followed.
The page http://www.hindustantimes.com/2003/Sep/ ... 310001.htm was not getting processed. I started removing one part of the HTML at a time and re-walking the site.
At one point I found that when I removed one particular URL from the page, the page was processed. This is the URL I cited as an invalid URL.
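
The manual elimination described above can be sped up by listing the links on the page and flagging the ones that look malformed. Below is a minimal sketch in Python, assuming the page's HTML is available as a string. The checks are illustrative guesses and are not Webinator's own rejection rules; `LinkCollector` and `suspicious_links` are hypothetical names made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the raw href values of all <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def suspicious_links(html):
    """Return (href, reason) pairs for links that look malformed.

    These heuristics are assumptions for illustration only.
    """
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href in collector.hrefs:
        stripped = href.strip()
        if not stripped:
            flagged.append((href, "empty href"))
            continue
        if any(ch.isspace() for ch in stripped):
            flagged.append((href, "whitespace inside URL"))
            continue
        try:
            parts = urlsplit(stripped)
        except ValueError as exc:
            flagged.append((href, f"unparseable: {exc}"))
            continue
        # An absolute http(s) URL with no host cannot be fetched.
        if parts.scheme in ("http", "https") and not parts.netloc:
            flagged.append((href, "missing host"))
    return flagged


if __name__ == "__main__":
    sample = (
        '<a href="/ok.htm">fine</a>'
        '<a href="">empty</a>'
        '<a href="http://exa mple.com/x">space</a>'
    )
    for href, reason in suspicious_links(sample):
        print(repr(href), "->", reason)
```

Any link this flags is only a candidate to try removing first; it does not show why the walk rejected the page.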
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Turn verbosity up to 4. Do a new walk. In the error report or in list/edit urls, find that url and see why it was rejected.
jintov
Posts: 29
Joined: Fri Feb 22, 2002 5:43 am

Post by jintov »

Some URLs (detail pages) have to be removed from the web site after some time, but the links to them will continue to be there on the home page. It looks like Webinator does not process a page any further once it comes across a link to a page that no longer exists, even though the link itself is still active.
Please help, as this is very urgent.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Webinator does not stop just because there are links to nonexistent pages. Make sure you're not using modified scripts. Get the current dowalk and webinatoradmin scripts from the Webinator examples page, http://www.thunderstone.com/texis/site/ ... ample.html

Turn verbosity up to 4. Do a new walk. In the error report or list/edit urls find that url and see why it was rejected.