Showing links that were disallowed

neil.munro
Posts: 22
Joined: Fri Nov 09, 2001 1:13 am

Post by neil.munro »

Is it possible to find out which links have been rejected by the walk (or rewalk) due to a disallowed protocol, MIME type, or anything like that?
When I walk a database with any degree of verbosity, I can see that links are disallowed, but I want to find out which links they are.
(Besides going to that page and looking at the HTML code...)
...regards.
mark
Site Admin
Posts: 5498
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

neil.munro
Posts: 22
Joined: Fri Nov 09, 2001 1:13 am

Post by neil.munro »

Thanks for that. However, this report (as far as I can tell) doesn't show what I want. For example, this is an extract from the walking of the database:
http://www.mywebsite.com/some-page.htm
. ........
32: TotLinks: 858, Links: 34/ 18, Good: 13, New: 1 Disallowed path(x/)
32: TotLinks: 858, Links: 34/ 6, Good: 23, New: 1 Disallowed protocol
32: TotLinks: 858, Links: 34/ 5, Good: 23, New: 1 Disallowed protocol
32: TotLinks: 858, Links: 34/ 4, Good: 23, New: 1 Disallowed protocol
32: TotLinks: 858, Links: 34/ 3, Good: 23, New: 1 Disallowed protocol
32: TotLinks: 858, Links: 34/ 2, Good: 23, New: 1 Disallowed protocol
32: TotLinks: 858, Links: 34/ 1, Good: 23, New: 1 Disallowed protocol

Sure, I can figure out the disallowed path; that is my doing. But the disallowed protocol? How can I find out what protocol it was trying to retrieve?
Is this what you meant by the error report? Or is there a file somewhere?

The above was run with a verbosity of 4 or greater.
In the error table, there is only one error about the robots.txt file...

regards,
mark
Site Admin
Posts: 5498
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Sorry, I was referring to version 4. In version 2 you need to turn verbosity up to 7 (I usually just crank it all the way to 9 when in doubt). It will then print the link it's working on first, so you can correlate the message to the link. Version 2 will not record any of those in the error table.
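
If the verbose output is saved to a file, the correlation can be scripted. The following is only a rough sketch in Python, not part of the product. The status-line pattern is taken from the extract quoted earlier in this thread. Two things are assumptions: that at high verbosity each link is printed as a bare URL on its own line before its status line, and the sample lines themselves (including the mailto: address), which are made up for illustration. Adjust the patterns to whatever your walk actually prints.

```python
import re

# Status lines as quoted earlier in the thread, e.g.
# "32: TotLinks: 858, Links: 34/ 6, Good: 23, New: 1 Disallowed protocol"
STATUS_RE = re.compile(r"^\s*\d+:\s*TotLinks:.*?New:\s*\d+\s+(Disallowed\b.*)$")

# ASSUMPTION: the link being worked on appears as a bare URL
# (scheme:rest, no spaces) on a line of its own.
URL_RE = re.compile(r"^\s*(\w+:\S+)\s*$")


def disallowed_links(lines):
    """Pair each 'Disallowed ...' message with the most recent URL line."""
    current = None  # last URL seen; None if no URL line came first
    results = []
    for line in lines:
        line = line.rstrip("\n")
        status = STATUS_RE.match(line)
        if status:
            results.append((current, status.group(1).strip()))
            continue
        url = URL_RE.match(line)
        if url:
            current = url.group(1)
    return results


# Hypothetical sample input, for illustration only.
sample = [
    "http://www.mywebsite.com/some-page.htm",
    "32: TotLinks: 858, Links: 34/ 18, Good: 13, New: 1 Disallowed path(x/)",
    "mailto:someone@example.com",
    "32: TotLinks: 858, Links: 34/ 6, Good: 23, New: 1 Disallowed protocol",
]

for link, reason in disallowed_links(sample):
    print(f"{reason}: {link}")
```

To run it against a real log, pass an open file object instead of the sample list, since the function accepts any iterable of lines.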