Retrieving entire list of 404's

mark_thomas
Posts: 9
Joined: Thu Nov 02, 2006 1:57 pm

Post by mark_thomas »

We would like to retrieve a list of all broken links from the search appliance. Is there any way to do that? The Show Errors list is only a partial list, and it fills up with other errors, whereas we are interested only in 404 errors.
John
Site Admin
Posts: 2597
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

If you click on the "Basic Settings" link for the profile you are interested in, and then replace basic.html in the URL with tsverrors.csv, you can download an Excel-compatible file with all the errors. You would then need to filter on the reason to pick out the entries with code 404.

You may need to use Excel's "Convert Text to Columns" and specify tab-delimited.
John Turnbull
Thunderstone Software
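
For anyone who would rather script this than use Excel, here is a minimal Python sketch of the two steps described above: swapping basic.html for tsverrors.csv in the URL, and keeping only the 404 rows of the tab-delimited file. The function names, the example host and the sample rows are made up for illustration. The actual column layout of tsverrors.csv is not described in this thread, so the filter simply keeps any row in which some field contains 404 as a standalone number.

```python
import csv
import io
import re

# Matches 404 as a standalone number, so "page404" does not count.
# Assumption: the error code appears as its own token in some field.
NOT_FOUND = re.compile(r"\b404\b")


def rewrite_to_errors_url(basic_settings_url):
    """Replace basic.html with tsverrors.csv, as described in the post."""
    if "basic.html" not in basic_settings_url:
        raise ValueError("expected a Basic Settings URL containing basic.html")
    return basic_settings_url.replace("basic.html", "tsverrors.csv")


def filter_404_rows(tsv_text):
    """Return the rows of a tab-delimited export that mention code 404."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if any(NOT_FOUND.search(f) for f in row)]


if __name__ == "__main__":
    # Hypothetical sample data; the real file may have different columns.
    sample = (
        "http://example.com/a\t404 Not Found\n"
        "http://example.com/b\tConnection timed out\n"
    )
    for row in filter_404_rows(sample):
        print("\t".join(row))
```

Note that a URL with 404 as a separate path segment (for example /errors/404/) would also match, so it may be worth restricting the check to the reason field once its position in the file is known.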
mark_thomas
Posts: 9
Joined: Thu Nov 02, 2006 1:57 pm

Post by mark_thomas »

Is there another, more direct way of retrieving the list? I'm trying to script the retrieval of the document, and it seems to be using some browser trickery to initiate the download, rather than just sending the document itself in the HTTP response.
mark
Site Admin
Posts: 5514
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

The trickery is a cookie. You have to be logged into the admin interface to download the errors.
John
Site Admin
Posts: 2597
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

It should send the document directly in the HTTP response, no trickery involved, although it does require the login cookie.
John Turnbull
Thunderstone Software
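
As a rough illustration of "requires the login cookie", the sketch below builds a request for the errors file with the cookie attached explicitly. It uses only Python's standard urllib. The function name, the host and the cookie value are placeholders, and how the cookie is obtained from the admin login is not covered here.

```python
import urllib.request


def build_errors_request(errors_url, cookie):
    """Build a GET request for the errors file that carries the login cookie.

    `cookie` is a "name=value" string taken from an admin session.
    """
    return urllib.request.Request(errors_url, headers={"Cookie": cookie})


if __name__ == "__main__":
    # Placeholder URL and cookie; substitute values from a real admin session.
    req = build_errors_request(
        "http://appliance.example/admin/tsverrors.csv", "session=abc123"
    )
    print(req.full_url, req.get_header("Cookie"))
```

Passing the returned object to urllib.request.urlopen would send the request; the body of the response should then be the tab-delimited error list.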
mark_thomas
Posts: 9
Joined: Thu Nov 02, 2006 1:57 pm

Post by mark_thomas »

I can get to the Basic Settings page via the script, so I'm pretty sure it is handling cookies.

But the fetch of the tsverrors.csv file isn't working... maybe because there's no actual link to follow for it, so it's not sending the cookie... I'll look into it further and see if I can get it to work.
mark_thomas
Posts: 9
Joined: Thu Nov 02, 2006 1:57 pm

Post by mark_thomas »

Aha! The TSA is not sending a cookie over HTTP headers the normal way. It is embedding it in the page as a meta tag. That qualifies as browser trickery to me :)
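
For others hitting the same thing, here is a minimal Python sketch of pulling a cookie out of a meta tag with the standard html.parser module. It assumes the tag has the common form `<meta http-equiv="Set-Cookie" content="name=value; ...">`; the exact markup the appliance emits is not shown in this thread, so treat the attribute names, the class and function names, and the sample page as assumptions.

```python
from html.parser import HTMLParser


class MetaCookieParser(HTMLParser):
    """Collect cookies set via <meta http-equiv="Set-Cookie" content="...">."""

    def __init__(self):
        super().__init__()
        self.cookies = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() != "set-cookie":
            return
        content = attr.get("content")
        if content:
            # Keep only the name=value pair, dropping path/expires attributes.
            self.cookies.append(content.split(";", 1)[0].strip())


def extract_meta_cookies(html):
    """Return the name=value cookie strings found in meta tags of `html`."""
    parser = MetaCookieParser()
    parser.feed(html)
    return parser.cookies


if __name__ == "__main__":
    # Hypothetical page; the real login page may look different.
    page = '<html><head><meta http-equiv="Set-Cookie" content="session=abc123; path=/"></head></html>'
    print(extract_meta_cookies(page))
```

The extracted string can then be sent back in a Cookie header on the request for tsverrors.csv.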
mark_thomas
Posts: 9
Joined: Thu Nov 02, 2006 1:57 pm

Post by mark_thomas »

I got it working. I now have a script that will automatically extract the errors.
michel.weber
Posts: 256
Joined: Sat Oct 08, 2005 12:40 pm

Post by michel.weber »

Are there any other undocumented features like this?