Login Problems

ian.thomas
Posts: 2
Joined: Fri Nov 12, 2004 6:33 am

Post by ian.thomas »

Hi,

We are currently using Webinator 4.4.8 to provide the search functionality for a site. We have a text file, automatically generated every morning, which Webinator points to. The site itself requires a login and password to create a session before the crawl can be performed.

In the text file the first line includes the login and password:

e.g.
http://www.site.net/page?loginuser=user ... d=password.

When Webinator connects to this page, the site opens the session and all following pages are included in the search.

The problem I'm having is that the URL I use to log in to the system is itself included in the search database. This means that if a user searches for something, they may get a result URL that includes the login and password string.

I'm not too experienced with Webinator, so I was wondering if there is any way to exclude this result from the search? I've tried adding the URL to the exclude box in the walk settings, but for some reason it still manages to slip through and be included in the search.

I've also tried putting the username and password in the login details in the walk settings, but I'm not sure this will work: no results are collected, as it gets stuck being verified by the system.

Has anyone got any ideas at all for a very novice Webinator user?

TIA for any information anyone can give me on this problem!

Ian
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

The ideal solution would be to add a meta robots tag to the login page, e.g.

<meta name="robots" content="noindex,follow">

which will tell the engine not to index that page while still following its links. Other solutions would include editing the script to fetch that page first, so it would not need to be listed, or to not store that page.
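
Before re-running the walk, it can help to confirm that the login page really serves the tag. The following is only a minimal sketch in Python using the standard library; it is not part of Webinator, and the names `RobotsMetaParser`, `robots_directives`, and `is_excluded_from_index` are made up for illustration. It assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # content is a comma-separated list such as "noindex,follow".
        for part in attr_map.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html_text):
    """Return the set of robots directives found in the page (hypothetical helper)."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


def is_excluded_from_index(html_text):
    """True if the page asks crawlers not to index it (hypothetical helper)."""
    return "noindex" in robots_directives(html_text)


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
    print(is_excluded_from_index(page))
```

If the check reports that `noindex` is missing, the tag is probably not reaching the crawler (for example, because the login URL redirects to a different page).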
John Turnbull
Thunderstone Software
ian.thomas
Posts: 2
Joined: Fri Nov 12, 2004 6:33 am

Post by ian.thomas »

Hi John,

Thanks for the reply!

I'll give the <meta name="robots" content="noindex,follow"> tag a go. I'm not too well up on editing scripts, so that's something I'll have to look into if this fails.

Thanks again! I'll get back to you on the results.

EDIT:
Just finished crawling again... it now works! Thanks for that. The page used to log the user in is now excluded from the database.

Cheers!
Ian
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Good to hear.

Please post follow-ups as new messages rather than editing earlier messages. Editing is only intended for fixing typos and such. Edited messages also don't trigger notifications for those who subscribe to the board.