I/O error

cpm18
Posts: 35
Joined: Mon Apr 13, 2009 3:21 pm

Post by cpm18 »

I've been running into an error that I can't quite figure out. When fetching https://www.zomato.com/, the fetch fails with the following message:
I/O error: An established connection was aborted by the software in your host machine in the function htbuf_readnblk

The script ultimately reports it as an SSL error. Note that this was done using Texis 6. I am aware that some new SSL-related features were added in v7, so I also tried the same fetch on a Texis 7 machine. That fetch also fails, although with a different error:
Timeout reading data from www.zomato.com:443 in the function htbuf_readnblk

This one ends up being reported as a connection timeout.

The unexpected observation I ended up making was that these errors only seem to occur when I set a typical browser as the user agent. Leaving the user agent blank results in the fetch succeeding on either version of Texis.

Is there a configuration setting I am missing that would explain why the fetches fail when a browser user agent is set, or is this simply a problem with the site I am fetching?
Kai
Site Admin
Posts: 1271
Joined: Tue Apr 25, 2000 1:27 pm

Post by Kai »

Timeout and I/O error are generally connectivity issues, i.e. caused by something beyond the local machine and its settings (e.g. network, firewall, remote machine etc.). If you notice a correlation between user agent and these errors, I suspect the remote machine or network is responding differently based on user agent. Perhaps they change content based on it, and there is an issue with that mechanism.

In any event, yes, I suspect it's a problem with the remote site.
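
One way to check for such a correlation outside of Texis is to repeat the same request with different user agents and compare the outcomes. The following is only a minimal sketch in Python, not Vortex; the function names `fetch_status` and `compare_user_agents` are made up for illustration. Note that Python's urllib sends its own default `User-Agent` when none is given, so the "no agent" case here is not truly blank.

```python
import http.client
import urllib.error
import urllib.request


def fetch_status(url, user_agent=None, timeout=15):
    """Fetch url once and return a short outcome string.

    Hypothetical helper: with user_agent=None, urllib still sends its
    default "Python-urllib/x.y" agent rather than an empty one.
    """
    headers = {"User-Agent": user_agent} if user_agent else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return f"HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status.
        return f"HTTP {err.code}"
    except (OSError, http.client.HTTPException) as err:
        # Timeouts, aborted connections and SSL failures land here.
        return f"failed: {err}"


def compare_user_agents(url, agents, fetch=fetch_status):
    """Return {label: outcome} for each labelled user agent string."""
    return {label: fetch(url, agent) for label, agent in agents.items()}
```

Calling `compare_user_agents(url, {"default": None, "browser": "<your browser's agent string>"})` a few times in a row should show whether the failures really follow the agent string or are intermittent.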
cpm18
Posts: 35
Joined: Mon Apr 13, 2009 3:21 pm

Post by cpm18 »

I've recently been revisiting this issue and am wondering if this site using HTTP/2.0 is causing an issue. Is Vortex capable of handling sites using HTTP/2.0?

A sample URL that is failing would be https://www.zomato.com/ncr/bukhara-itc- ... hi/reviews. I can load this page in my browser, and I can reproduce the request in curl when the --http2 and --compressed options are set.
cpm18
Posts: 35
Joined: Mon Apr 13, 2009 3:21 pm

Post by cpm18 »

I should probably also mention that I am using version 7.03.1434496242 20150616 (x86_64-unknown-winnt-64-64)
Kai
Site Admin
Posts: 1271
Joined: Tue Apr 25, 2000 1:27 pm

Post by Kai »

Texis (including your version) does not currently support HTTP/2.0. However, that URL is fetchable with that version of Texis from our end, using HTTP/1.1 (which is what Vortex would use). So I suspect it's still some connectivity issue with the remote host.
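
To test whether HTTP/2 is really required, the same request can be forced to HTTP/1.1 in curl and compared with an HTTP/2 attempt. The sketch below only builds the curl argument list; the helper name `curl_command` is hypothetical, and it assumes a curl build recent enough to support `--http1.1`, `--http2` and the `%{http_version}` write-out variable.

```python
import os

# curl options that pin the protocol version for a request.
VERSION_FLAGS = {"1.1": "--http1.1", "2": "--http2"}


def curl_command(url, http_version="1.1", user_agent=None, compressed=True):
    """Build a curl argument list that forces a specific HTTP version."""
    args = ["curl", "--silent", "--show-error", VERSION_FLAGS[http_version]]
    if compressed:
        # Ask for a compressed response and decode it automatically.
        args.append("--compressed")
    if user_agent:
        args += ["--user-agent", user_agent]
    # Discard the body and print only the status code and the protocol
    # version that was actually negotiated.
    args += [
        "--output", os.devnull,
        "--write-out", "%{http_code} HTTP/%{http_version}\\n",
        url,
    ]
    return args
```

The returned list can be passed to `subprocess.run`. If the HTTP/1.1 variant succeeds with the same agent string, the protocol version is unlikely to be the cause.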

Does the fetch still fail only when a particular user agent is set? If so, what is that agent string?
cpm18
Posts: 35
Joined: Mon Apr 13, 2009 3:21 pm

Post by cpm18 »

I was eventually able to reliably fetch the page. The site appears to be very particular about which headers and user agent are being passed. It appears HTTP/2.0 was not the issue here.
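
For anyone hitting something similar, one way to narrow down which headers a site insists on is to start from a header set that works (for example, one copied from a browser) and drop one header at a time. This is a rough sketch rather than the exact procedure used here; `required_headers` and the `fetch_ok` callback are hypothetical names, and `fetch_ok` stands for whatever performs the real fetch and returns True on success.

```python
def required_headers(headers, fetch_ok):
    """Return the names of headers whose removal makes the fetch fail.

    headers:  dict of request headers known to give a successful fetch.
    fetch_ok: callable taking a headers dict, returning True on success.
    """
    if not fetch_ok(headers):
        raise ValueError("baseline header set does not fetch successfully")
    needed = []
    for name in headers:
        # Retry with every header except the one under test.
        trimmed = {k: v for k, v in headers.items() if k != name}
        if not fetch_ok(trimmed):
            needed.append(name)
    return needed
```

This only tests headers one at a time, so it will not catch cases where the site accepts either of two headers. It is also worth repeating each fetch, since intermittent failures would otherwise look like a required header.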