Using log to check for broken links

b.sims
Posts: 99
Joined: Fri Oct 26, 2001 10:40 am

Post by b.sims »

I have read the topic on doing this. However, I would like to ask whether it is possible to differentiate between broken URLs passed to Webinator (i.e., those that I have placed in my .lst file) and those it spiders to automatically.

Also, is it possible to view the error table directly? I know that it is possible to port it to HTML, but ideally I would prefer to have just the broken URLs as text or in a MySQL database.
mark
Site Admin
Posts: 5515
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

URLs you specify have a Depth of 0 in the html table. You can look them up there.
"select Depth from html where Url=..."

You can dump the error table from gw with -s, or from a Vortex script using <sql>.
"select * from error"
b.sims
Posts: 99
Joined: Fri Oct 26, 2001 10:40 am

Post by b.sims »

Thanks for the tips.

This works, but it gives everything, not just broken links. If I want to isolate broken links then I need (I think?) to find entries that have a depth of 0 and exist in the error table.

I can dump the table to my DOS window, but can you tell me how to get it into a .txt file, and from there into databases etc.?
mark
Site Admin
Posts: 5515
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Duh, sorry, I was answering before thinking. Not-found pages will not be in the html table and therefore cannot be looked up there. The URLs that you specify will be in the options table, though, as long as you're not using -O.

gw -s "select options.String Url,error.Reason Reason
from error,options
where options.Name='URL' and options.String='http://'+error.Url" >errors.txt
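
To make the join above easier to follow, here is a minimal sketch that reproduces its logic in Python with an in-memory SQLite database. It is only an illustration, not Webinator itself: the table layouts are reduced to the columns named in the query, the sample rows and the helper name `seed_url_errors` are made up, and it assumes (as the query implies) that error.Url is stored without the "http://" prefix. SQLite concatenates strings with `||` where the query above uses `+`.

```python
import sqlite3


def seed_url_errors(options_rows, error_rows):
    """Return (Url, Reason) pairs for failed URLs that were given explicitly.

    options_rows: iterable of (Name, String) tuples (hypothetical sample data)
    error_rows:   iterable of (Url, Reason) tuples (hypothetical sample data)
    """
    con = sqlite3.connect(":memory:")
    # Stand-in tables holding only the columns used by the query in the post.
    con.execute("CREATE TABLE options (Name TEXT, String TEXT)")
    con.execute("CREATE TABLE error (Url TEXT, Reason TEXT)")
    con.executemany("INSERT INTO options VALUES (?, ?)", options_rows)
    con.executemany("INSERT INTO error VALUES (?, ?)", error_rows)
    # Same join as the gw -s query: keep an error row only when its URL,
    # with the scheme put back on, matches a URL recorded in options.
    rows = con.execute(
        "SELECT options.String AS Url, error.Reason AS Reason "
        "FROM error, options "
        "WHERE options.Name = 'URL' "
        "AND options.String = 'http://' || error.Url "
        "ORDER BY options.String"
    ).fetchall()
    con.close()
    return rows


# Made-up rows: two URLs supplied by hand, one unrelated option,
# one error on a supplied URL and one on a page reached by spidering.
options_rows = [
    ("URL", "http://www.example.com/missing.html"),
    ("URL", "http://www.example.com/ok.html"),
    ("Depth", "3"),
]
error_rows = [
    ("www.example.com/missing.html", "Not found"),
    ("www.example.com/spidered/gone.html", "Not found"),
]

for url, reason in seed_url_errors(options_rows, error_rows):
    print(url, reason, sep="\t")
```

Only the error on the hand-supplied URL survives the join; the spidered page is dropped because nothing in options matches it.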
b.sims
Posts: 99
Joined: Fri Oct 26, 2001 10:40 am

Post by b.sims »

Thanks, I now have a list of the links that didn't work. One last question before I stop bugging you: can I get Webinator to tell me where it found the broken links, i.e., on which page of my website? Both FrontPage and Dreamweaver have features that do this, but my site is 15,000 pages and they cannot handle it.
mark
Site Admin
Posts: 5515
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

b.sims
Posts: 99
Joined: Fri Oct 26, 2001 10:40 am

Post by b.sims »

Yes, that was what I was aiming at in the earlier parts of the conversation. This question is actually in response to a different request from someone on my team, who wants to know whether the new software will allow us to test our own site for broken links, internal and external.

For example, I give the crawler: gw http://www.mysite.com/thesubdirectory

It then reads all the pages, follows links, indexes etc. I can access the error messages produced when it tries to view a broken link through the error table and log. But I would like to know which page of my site the broken link was on.

I'm just trying to get out of having to pay for Webinator to do our search engine and something else to do our link checking and maintenance.
b.sims
Posts: 99
Joined: Fri Oct 26, 2001 10:40 am

Post by b.sims »

Sorry, just ignore that, and thanks for the help. I have RTFMed a bit more closely and will give it another go. You guys must get sick of this sort of thing...