How to simulate browser?

Posted: Fri Jul 02, 2010 3:42 pm
by cpm18
I'm using Vortex to fetch a page.
http://forums.pinnaclesys.com/forums/default.aspx

When I view this page with a browser, it loads as expected, but my Vortex fetch is redirected to an error page:
http://forums.pinnaclesys.com/error.htm ... fault.aspx

I considered that the site might be blocking me, but after trying again through a proxy I still get redirected to the error page. Is there some way to make Vortex fetch the page the same way a browser does?

Posted: Fri Jul 02, 2010 4:44 pm
by jason112
It's probably keying off the "User-Agent" HTTP header. Check out <urlcp useragent> to change it to that of a standard browser.

Posted: Wed Jul 07, 2010 11:08 am
by cpm18
That's what I figured, but changing the useragent has not made any difference.

Somehow the site is able to redirect my fetch to the error page but my browser goes to the correct page.

Posted: Wed Jul 07, 2010 1:42 pm
by mark
This makes it work for me:
<urlcp header "Accept-Language" "en-us,en;q=0.5">


Basically, sniff what a browser is sending and set urlcp options one at a time to match it, until you find what the server is looking for.
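
For anyone who wants to script the "one at a time" approach outside Vortex, here is a rough Python sketch. It is not Vortex code and not how urlcp works internally. Only the Accept-Language value comes from this thread; the other header values, the function names and the example URL are made up for illustration. It just builds the trial requests and leaves the actual fetching to you.

```python
from urllib.request import Request

# Headers a browser might send. Only Accept-Language is the value quoted
# in this thread; the other two are illustrative assumptions.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)",
    "Accept": "text/html,application/xhtml+xml,*/*;q=0.8",
    "Accept-Language": "en-us,en;q=0.5",
}


def one_at_a_time(baseline, browser_headers):
    """Yield (header_name, trial_headers) pairs.

    Each trial is the baseline plus exactly one browser header, so a
    change in the server's response can be pinned on that header.
    """
    for name, value in browser_headers.items():
        trial = dict(baseline)  # copy, so the baseline is never mutated
        trial[name] = value
        yield name, trial


def build_request(url, headers):
    """Build (but do not send) a request carrying the given headers."""
    return Request(url, headers=headers)


if __name__ == "__main__":
    # Hypothetical URL; pass each request to urllib.request.urlopen()
    # and compare the responses to see which header matters.
    for name, headers in one_at_a_time({}, BROWSER_HEADERS):
        req = build_request("http://example.com/", headers)
        print(name, "->", req.header_items())
```

If no single header changes the response, the next step would be trying pairs, since some servers check more than one thing.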

Posted: Mon Jan 17, 2011 1:34 pm
by cpm18
Instead of starting a new thread, I figure I should throw this in with this one since it's a similar problem.

I'm trying to fetch from www.boatingforumz.com, but the site appears to have some kind of detection process that gives me different output from what my browser gets. The site looks normal when viewed from a browser, but my fetch always has a $ret of...

hello. This is the page you requestd.

I've been trying to get Vortex to simulate the fetch from my browser. I've tried different user agents, matching the Accept-Language header, switching encodings, playing around with cookies, and switching HTTP versions between 1.0 and 1.1, all without any change.

Perhaps I just haven't got the right combination of settings but for whatever reason this one is stubborn.

Posted: Mon Jan 17, 2011 2:42 pm
by mark
Don't know what user agents you've tried but this works for me:
"Mozilla/4.0 (compatible; T-H-U-N-D-E-R-S-T-O-N-E)"

The built-in default of
"Mozilla/2.0 (compatible; T-H-U-N-D-E-R-S-T-O-N-E)"
seems to cause the server to return the alternate content.

Posted: Mon Jan 17, 2011 5:34 pm
by cpm18
I wonder if there is more to it? I tried with that user agent as well as...
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 3.5.21022)

And I always get the useless data. I must be missing a setting, or have some other setting that is causing problems. I'll have to keep trying some other things.

Posted: Mon Jan 17, 2011 5:36 pm
by cpm18
Actually, never mind. I was able to get it working with the user agent Mark listed above.

Posted: Mon Jan 17, 2011 5:50 pm
by mark
What's your Texis version? I tried 6.00.1272471692 20100428 and 5.01.1260557077 20091211.

There's also a possibility, if you've been hitting them a lot, that you've been blacklisted. Try doing the fetch from a different IP or network or through a proxy to see if that changes anything.

Try a network sniffer to compare what your browser is sending vs. what Texis is sending.
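
Once you have both captures, the comparison itself is simple to script. This is a rough Python sketch, not a Texis feature. The function name is made up, and the header dictionaries are assumed to have been copied by hand out of the sniffer output. The two User-Agent strings are the ones quoted in this thread; the Accept-Language entry is included only as an example of a header the client does not send.

```python
def diff_headers(browser, client):
    """Compare two header dicts, treating header names case-insensitively.

    Returns (missing, differing): names the browser sends that the client
    does not send, and names both send but with different values.
    """
    b = {name.lower(): value for name, value in browser.items()}
    c = {name.lower(): value for name, value in client.items()}
    missing = sorted(name for name in b if name not in c)
    differing = sorted(name for name in b if name in c and b[name] != c[name])
    return missing, differing


if __name__ == "__main__":
    # Hand-copied from a sniffer capture (assumed, for illustration).
    browser = {
        "User-Agent": "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 3.5.21022)",
        "Accept-Language": "en-us,en;q=0.5",
    }
    client = {
        "user-agent": "Mozilla/2.0 (compatible; T-H-U-N-D-E-R-S-T-O-N-E)",
    }
    missing, differing = diff_headers(browser, client)
    print("missing from client:", missing)
    print("different values:", differing)
```

Headers in the "missing" list are the first candidates to add with <urlcp header>, and the ones in the "differing" list are the ones to change.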

Posted: Fri Feb 04, 2011 5:15 pm
by cpm18
I don't think this is a user agent issue, but it is somewhat related to the previous questions.

I'm trying to fetch this page: http://www.quora.com/Dell. When I view the page source in my browser, one part of the code is not consistent with what my fetches get.

With the browser, I see...
<span class="timestamp">12:30am on Friday

When I do a fetch of the page, I get...
<span class="datetime" id="__w2_kO6E1QA_datespan">Insert a dynamic date here

There is then a whole bunch of JavaScript later on the page, which is probably used to generate the date.

So I am wondering: is Vortex able to get the same source HTML that the browser is getting, or will I have to find a way to execute the JavaScript through Vortex to get the dates that appear in the browser's HTML?