Crawl an HTTPS site that requires a password POST?

rgwin0
Posts: 15
Joined: Fri May 04, 2007 1:45 pm

Post by rgwin0 »

I've got a site that requires a password submitted via POST, and the Webinator crawl works fine when I set a Custom Primer URL with the http+post pseudo-protocol. However, the site now needs to be hosted on HTTPS. I tried changing "http+post" to "https+post" to see if it would work, but it didn't.

How can I crawl an HTTPS site that requires a password POST?

Thanks,
rob
John
Site Admin
Posts: 2597
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

Which version of Webinator do you have? That should work with current versions.
John Turnbull
Thunderstone Software
jason112
Site Admin
Posts: 347
Joined: Tue Oct 26, 2004 5:35 pm

Post by jason112 »

Short version: it's a bug that will be fixed. For now, you can enter a "Base URL MM Query" of /https://www.yoursite.com and it should work.


Not every primer is used for every base URL; a primer only fires if its "Base URL MM Query" matches the base URL. If a query isn't specified, one is generated that matches the primer's hostname.

It looks like there's a bug in the logic that generates the matching expression: it doesn't handle https+post properly. This will be fixed in an update, but in the meantime you can specify the Base URL MM Query manually.
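
To make the matching rule above concrete, here is a rough Python sketch of the idea. It is not Webinator's code: the function names `default_query` and `primer_applies` are made up for illustration, and plain substring matching stands in for the real Metamorph query matching (so the leading "/" from the manual query is left off here).

```python
from urllib.parse import urlsplit


def default_query(primer_url):
    """Hypothetical stand-in for the generated query: the primer's hostname."""
    scheme, rest = primer_url.split("://", 1)
    # Drop a "+post" style suffix so "https+post" is treated like "https".
    real_scheme = scheme.split("+", 1)[0]
    return urlsplit(f"{real_scheme}://{rest}").hostname


def primer_applies(primer_url, base_url, query=None):
    """A primer fires only if its query (manual or generated) matches the base URL."""
    needle = query if query is not None else default_query(primer_url)
    # Assumption: substring test in place of Metamorph matching.
    return needle in base_url


primer = "https+post://www.yoursite.com/login"
print(primer_applies(primer, "https://www.yoursite.com/"))
print(primer_applies(primer, "https://other.example.com/"))
print(primer_applies(primer, "https://www.yoursite.com/",
                     query="https://www.yoursite.com"))
```

Passing `query` explicitly corresponds to filling in the "Base URL MM Query" field by hand, which bypasses the generated default entirely.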
rgwin0
Posts: 15
Joined: Fri May 04, 2007 1:45 pm

Post by rgwin0 »

Jason112, that did the trick, thanks!