handshake error when attempting to index https content

henry.legedza
Posts: 8
Joined: Wed Aug 29, 2012 9:24 pm

Post by henry.legedza »

Hi there,

Currently, whenever we try indexing content on an https site, we continually get the following error:

Server error: https://siteurl returned code 500 (handshakefailed)

I have Webinator set so that it recognises the https protocol.

Any suggestions as to what to check for? I've never come across this error.

Thanks
Henry
mark
Site Admin
Posts: 5513
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Code 500 is the server saying it hit an unknown error while attempting to deliver the requested content. It is often caused by a server app breaking. If possible, check the web server's error logs to see if they provide any more detail. Otherwise, try setting Webinator's User Agent to match what a browser would send; cgi/asp/etc. apps sometimes crash when given unexpected input.
henry.legedza
Posts: 8
Joined: Wed Aug 29, 2012 9:24 pm

Post by henry.legedza »

I checked the IIS error logs through Event Viewer and couldn't find anything obvious.

Currently our User Agent is Mozilla/5.0 (compatible; T-H-U-N-D-E-R-S-T-O-N-E)

The same web server quite happily delivers http content for indexing; it's just https that fails.
henry.legedza
Posts: 8
Joined: Wed Aug 29, 2012 9:24 pm

Post by henry.legedza »

I looked through the vortex log and noticed that before each attempt at reindexing the https content, this message was logged:
"User Public has been added without a password"

Does this mean anything?
jason112
Site Admin
Posts: 347
Joined: Tue Oct 26, 2004 5:35 pm

Post by jason112 »

That's normal; it's logged every time a new database is created. That happens whenever a profile is created or a "New" walk is started.

Are you able to crawl any other https sites, or do all of them exhibit the same behavior?
henry.legedza
Posts: 8
Joined: Wed Aug 29, 2012 9:24 pm

Post by henry.legedza »

I tried an external https site with no authentication required and got the same handshake error.

The Webinator box does need to go out through a proxy server, and the handshake issue seems to have started around the time it was moved to this new proxy server.

Is there anything there we might need to look at?
jason112
Site Admin
Posts: 347
Joined: Tue Oct 26, 2004 5:35 pm

Post by jason112 »

The current release of Webinator cannot properly connect to https sites through a proxy. The new proxy server is likely recognizing this and returning the 500 error.

It's theoretically possible for the proxy server to allow this by acting as a "man in the middle", but this proxy server either can't or is choosing not to.

This functionality will be added in a future release of Webinator.
henry.legedza
Posts: 8
Joined: Wed Aug 29, 2012 9:24 pm

Post by henry.legedza »

You suggested: "It's theoretically possible for the proxy server to allow this by acting as a "man in the middle", but this proxy server either can't or is choosing not to."

Are there things we can check on the proxy to determine whether it can't do this or just isn't set up to?
henry.legedza
Posts: 8
Joined: Wed Aug 29, 2012 9:24 pm

Post by henry.legedza »

I have spoken to our IT people and this was their response:

The proxy is able to handle both HTTP and HTTPS proxy requests. It is already configured to allow all traffic from our search server to be proxied without needing authentication. It is also configured to bypass the SSL scanners etc. for the search server.

The central proxy is already configured to bypass the majority of the auth/filtering for the search server.
Kai
Site Admin
Posts: 1271
Joined: Tue Apr 25, 2000 1:27 pm

Post by Kai »

To clarify, when attempting to make an https/SSL connection through a proxy, Webinator will issue a `GET https://somesite/path' request to the proxy -- the same as it would with an http URL. It does not yet support the `CONNECT' method to tell the proxy to make a pass-through TCP connection (allowing a seamless SSL connection from Webinator to the origin server). A future release of Webinator will support `CONNECT'.
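
To make the difference concrete, here is a minimal Python sketch of the two request forms a client can send to a proxy for an https URL. This is only an illustration of the HTTP messages, not Webinator code; the function names and the `somesite` host are placeholders.

```python
def proxy_get_request(url: str, host: str) -> bytes:
    """Absolute-URI GET: the same form a client uses for plain http via a proxy."""
    return (
        f"GET {url} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode("ascii")


def proxy_connect_request(host: str, port: int = 443) -> bytes:
    """CONNECT: asks the proxy to open a pass-through TCP tunnel to host:port."""
    return (
        f"CONNECT {host}:{port} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "\r\n"
    ).encode("ascii")


if __name__ == "__main__":
    # Print both forms side by side for comparison.
    print(proxy_get_request("https://somesite/path", "somesite").decode())
    print(proxy_connect_request("somesite").decode())
```

With `CONNECT`, the SSL handshake happens between the client and the origin server after the proxy confirms the tunnel; with the `GET` form, the proxy itself has to make the https request on the client's behalf.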

Perhaps your proxy is rejecting `GET https://...', expecting a `CONNECT' instead for https content? When you say http (unsecure) content works, I assume it also works through the same proxy? That would indicate to me that the proxy might be expecting `CONNECT'.
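
One way to check this from the search server is to send each request form to the proxy by hand and compare the status codes it returns. The following is a rough Python sketch under stated assumptions: `proxy.example.internal`, port 8080, and `somesite` are hypothetical values to replace with your own, and the helper names are made up for illustration.

```python
import socket


def parse_status_code(response: bytes) -> int:
    """Return the numeric status code from the first line of an HTTP response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP status line: {status_line!r}")
    return int(parts[1])


def probe_proxy(proxy_host: str, proxy_port: int, request: bytes,
                timeout: float = 10.0) -> int:
    """Send one raw request to the proxy and return the status code it answers with."""
    with socket.create_connection((proxy_host, proxy_port), timeout=timeout) as sock:
        sock.sendall(request)
        data = b""
        # Read until the status line is complete or the proxy closes the socket.
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_status_code(data)


if __name__ == "__main__":
    proxy = ("proxy.example.internal", 8080)  # hypothetical proxy address
    get_req = b"GET https://somesite/ HTTP/1.1\r\nHost: somesite\r\n\r\n"
    connect_req = b"CONNECT somesite:443 HTTP/1.1\r\nHost: somesite:443\r\n\r\n"
    print("GET https://... ->", probe_proxy(*proxy, get_req))
    print("CONNECT         ->", probe_proxy(*proxy, connect_req))
```

If the `CONNECT` probe comes back with a 2xx code while the `GET https://...' probe returns an error, that would support the idea that the proxy only accepts `CONNECT' for https.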