Thunderstone and Drupal

rjhertzberg
Posts: 23
Joined: Wed Sep 10, 2008 10:38 am

Post by rjhertzberg »

Does anyone out there have any experience walking a Drupal site that requires authentication?

We are attempting to do this but we keep running into roadblocks. At first, we thought we were having an issue authenticating against our Single Sign-On (CAS), even though we are able to walk other sites hooked up to our CAS installation. However, even with CAS out of the picture, and just using the default Drupal authentication, we still can't seem to walk the site. It appears as though Drupal thinks the TS box is not logged in. This is the same type of behavior that was exhibited when we had CAS hooked up.

Has anyone else experienced this? If so, what did you do to work around the issue? Also, if you are crawling Drupal and haven't had any issues, I'd appreciate knowing that as well.

Thanks in Advance!!

-Russell
mark
Site Admin
Posts: 5513
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Assuming a "Base URL" like
http://www.mysite.com/drupal/
Replace the "Exclusions" with
/drupal/logout
/drupal/user
and any other areas you don't want indexed.
Set "Strip Queries" to N.
Set "Primer Type" to Custom.
Set "Custom Primer URL" to
http://www.mysite.com/drupal/
Set "Custom Primer Variables" to
name=MYLOGIN&pass=MYPASSWORD
where "MYLOGIN" and "MYPASSWORD" are your login and password, respectively. Be sure to URL-encode those values, e.g. use %20 for a space: ...pass=MY%20PASSWORD
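
As a minimal sketch of the URL-encoding step, the Python below builds a "Custom Primer Variables" string. The function name `primer_variables` and the sample login and password are placeholders for illustration only. `quote_via=quote` is used so that a space comes out as %20, as in the example above, rather than as "+".

```python
from urllib.parse import quote, urlencode


def primer_variables(login: str, password: str) -> str:
    """Build a name=...&pass=... string with percent-encoded values."""
    # quote_via=quote encodes a space as %20; the default (quote_plus) would emit "+".
    return urlencode({"name": login, "pass": password}, quote_via=quote)


# Placeholder credentials, not real ones.
print(primer_variables("MYLOGIN", "MY PASSWORD"))
```

Characters such as "&" and "=" inside a password are also percent-encoded this way, so they cannot be mistaken for separators between variables.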
rjhertzberg
Posts: 23
Joined: Wed Sep 10, 2008 10:38 am

Post by rjhertzberg »

Mark - Thanks for your reply. Unfortunately, this is already how we have our walk set up, and it still doesn't work. I am assuming from your post that you have gotten this to work on your end, correct?

From what I can tell, it looks like we get logged in during the Primer URL call, but then once it does the walk, Drupal doesn't think we're logged in.

The whole thing is very strange because the site works fine while browsing. And, we're walking several other sites that require authentication without any issues. This seems to be the only one we are having trouble with. The only time I was able to walk the site is when we had authentication turned off. If you have any other suggestions on what I might look at, I would certainly appreciate it!!

Thanks,
Russell
jason112
Site Admin
Posts: 347
Joined: Tue Oct 26, 2004 5:35 pm

Post by jason112 »

Was the ".maxdaemon.com" cookie thing something that was necessary for this to work? I'd thought that would be a Texis fix.
John
Site Admin
Posts: 2595
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

Does that happen with all pages, including the first one fetched? Do you have any other pages that might have a logout effect if fetched?
John Turnbull
Thunderstone Software
mark
Site Admin
Posts: 5513
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Make sure you're using "www.mysite.com", or whatever it's called, not "mysite.com", in both your "Base URL" and your "Custom Primer URL".
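
A quick way to check this outside the crawler is to compare the hostnames of the two URLs. The sketch below is only a consistency check and says nothing about why a mismatch would break the login; the helper name `same_host` and the sample URLs are illustrative assumptions.

```python
from urllib.parse import urlsplit


def same_host(base_url: str, primer_url: str) -> bool:
    """Return True when both URLs name exactly the same host."""
    # urlsplit().hostname is already lower-cased; it is None if no host is present.
    base_host = urlsplit(base_url).hostname or ""
    primer_host = urlsplit(primer_url).hostname or ""
    return base_host != "" and base_host == primer_host


# Same host in both settings.
print(same_host("http://www.mysite.com/drupal/", "http://www.mysite.com/drupal/"))
# "www.mysite.com" versus "mysite.com" counts as a mismatch.
print(same_host("http://www.mysite.com/drupal/", "http://mysite.com/drupal/"))
```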

If that doesn't do it, you'll probably have to supply a lot more detail about your crawl settings and site. If you don't want to do that here, you can open a support ticket.