Not able to log in to Webinator

erling.ervik
Posts: 11
Joined: Fri Mar 20, 2009 4:36 am

Post by erling.ervik »

I have made sure that I have a scripts directory and that it has execute permissions for scripts and executables.

I ran the installer; it suggested the correct directories, and everything worked as expected. But I cannot start the admin console.

Error message: The page cannot be found
The URL that Webinator tried after install is:
http://kmyhre-inett.inett.utvikling.no/ ... tor/dowalk

I think this URL is wrong so I have tried:
http://localhost/texis/webinator/dowalk

http://localhost/
brings up the normal website.

We are just testing the free version so far, and have not purchased anything yet.

We are using Windows Server 2003 on a VMware virtual machine, running on 64-bit Vista.

The install was run for use with ISAPI. I was not able to find any errors in the event log mentioning texis or Webinator.
jason112
Site Admin
Posts: 347
Joined: Tue Oct 26, 2004 5:35 pm

Post by jason112 »

The ISAPI Proxy Module should make a single entry in the event log on successful startup, so I'm thinking it might not be present.

In the IIS Manager, if you look at the root directory of the website, is there a virtual directory "texis" beneath it?

If not, please follow the "manual setup" steps listed in the manual for adding the ISAPI Proxy Module to IIS:
http://www.thunderstone.com/site/webina ... later.html
erling.ervik
Posts: 11
Joined: Fri Mar 20, 2009 4:36 am

Post by erling.ervik »

Hi, and thanks for your answer.

I have followed your instructions for manual setup, and now have a virtual directory under the main application called siteSearch.

But any attempt to browse http://localhost/siteSearch/Webinator/dowalk
just returns a 404 error message.

Any other ideas?
jason112
Site Admin
Posts: 347
Joined: Tue Oct 26, 2004 5:35 pm

Post by jason112 »

Webinator currently requires the virtual directory to be named "texis" (I just saw that the documentation wasn't clear on this; it will be changed).

Unfortunately virtual directories can't be renamed, so you'll need to delete "siteSearch" and create a "texis" virtual directory with the same wildcard application map as done before.
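
For a quick check after recreating the virtual directory, a short script can request the admin URL under each candidate name and report the HTTP status. This is only a sketch: it assumes Python is available on the server, the helper names admin_url, http_status and report are made up for illustration, and the URL layout is simply the one quoted in this thread.

```python
import urllib.error
import urllib.request


def admin_url(host, vdir="texis"):
    """Build the admin URL under a virtual directory (layout as quoted in this thread)."""
    return f"http://{host}/{vdir}/webinator/dowalk"


def http_status(url, timeout=5):
    """Return the HTTP status code for url, or None if the server cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Raised for 4xx/5xx answers, e.g. 404 when the virtual directory is missing.
        return err.code
    except urllib.error.URLError:
        return None


def report(host, vdirs=("texis", "siteSearch")):
    """Print the status of the admin URL under each candidate virtual directory."""
    for vdir in vdirs:
        url = admin_url(host, vdir)
        print(url, "->", http_status(url))
```

Calling report("localhost") on the server prints one line per candidate directory; a 404 for the "texis" entry would suggest the virtual directory or its wildcard application map is still not in place.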
erling.ervik
Posts: 11
Joined: Fri Mar 20, 2009 4:36 am

Post by erling.ervik »

Thanks!
That did the trick. Now on to testing...
erling.ervik
Posts: 11
Joined: Fri Mar 20, 2009 4:36 am

Post by erling.ervik »

We have now done a little testing, and it didn't go so well. Of the 25,000+ pages on our site, Webinator found only 22. Of those, 1 was indexed and 21 were rejected with errors.

I specified that the page extensions should be: .html .htm .txt .aspx .ascx, as this website is built with Microsoft .NET. There are not many .htm or .html pages; almost all pages are of the .aspx and .ascx (web control) types.

There is some JavaScript in most pages, and since this is not supported in this demo version, that may be the reason why 21 pages were rejected. Why it didn't find more than 22 pages is beyond me.

It takes a day to get answers (due to the time difference between the US and Norway). My understanding is that Webinator is not well suited for use on websites like this one, built with MS .NET using EPIServer as a framework.

I don't think we can afford to spend much more time testing this further. Sorry about that.
jason112
Site Admin
Posts: 347
Joined: Tue Oct 26, 2004 5:35 pm

Post by jason112 »

Webinator should have no problem crawling aspx and ascx files, just as a browser has no problem with them - Webinator interacts with the site as if it were a browser.

What were the errors listed for the other 21 URLs?
jason112
Site Admin
Posts: 347
Joined: Tue Oct 26, 2004 5:35 pm

Post by jason112 »

Also, lack of a JavaScript module will not cause pages to be rejected; the JavaScript simply won't be executed. In the worst case, some links may not be followed if they're _generated_ by JavaScript.

But lack of JavaScript shouldn't be the cause of any errors seen.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

You'll probably need to remove ? from the exclusions and turn off "strip queries".
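
To illustrate why those two settings can matter on a site whose pages are told apart by their query strings, here is a rough sketch. It is not Webinator's actual logic: the function names and the sample URLs are hypothetical, and it only models the general idea of substring exclusions and query stripping.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Drop the query string and fragment from a URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def crawlable(urls, exclusions=(), strip_queries=False):
    """Return the distinct URLs left after exclusions and optional query stripping."""
    kept = set()
    for url in urls:
        # Skip any URL containing one of the excluded substrings.
        if any(pattern in url for pattern in exclusions):
            continue
        kept.add(strip_query(url) if strip_queries else url)
    return kept


# Hypothetical links, standing in for pages that differ only by query string.
links = [
    "http://localhost/page.aspx?id=1",
    "http://localhost/page.aspx?id=2",
    "http://localhost/page.aspx?id=3",
]

print(len(crawlable(links, exclusions=("?",))))        # every link is excluded
print(len(crawlable(links, strip_queries=True)))       # all links collapse into one URL
print(len(crawlable(links, exclusions=("/cgi-bin/",))))  # all links kept as distinct pages
```

In this toy model, either setting alone is enough to hide most of such a site from the walk, which is why both are worth changing together.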

During testing you should set your rewalk type to new instead of refresh. You can also set verbosity to 4 to better see why it skips things. Put it back to 2 once it's running normally.

To see if the site can be indexed without JavaScript, turn off JavaScript in your browser and see if you can navigate to the places you want to index.
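
The same check can be approximated from a script by listing only the links that are present in the raw HTML. This is a sketch under assumptions, not what Webinator does internally: the LinkCollector class, the static_links helper and the sample markup are invented for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags found in the static HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def static_links(html):
    """Return hrefs that can be followed without executing any JavaScript."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return [h for h in parser.links if not h.lower().startswith("javascript:")]


# Hypothetical markup: one plain link, one postback link, one script-generated link.
sample = """
<a href="/about.aspx">About</a>
<a href="javascript:__doPostBack('menu','1')">Products</a>
<script>document.write('<a href="/hidden.aspx">Hidden<\\/a>');</script>
"""

print(static_links(sample))
```

Feeding the function the HTML of the real start page (for example, fetched with urllib.request) gives a rough count of the links a walker without JavaScript support could follow. If that list is nearly empty, the navigation probably depends on scripts.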
erling.ervik
Posts: 11
Joined: Fri Mar 20, 2009 4:36 am

Post by erling.ervik »

I tried once more: I turned off "strip queries" and removed ? from the exclusions.

Here is the log from the walk:
Walk Status
Current User: webinator
Current Profile: NyTest Webinator 5.1.78-Windows-wo/plugin

Latest run:
0 pages in todo
1 pages scheduled to be refreshed in the next hour
1 pages visited in the last hour (1 success/0 failed)
1 pages in index


Pages recently walked
1 pages (63,270 bytes).
0 errors.
0 duplicate pages.

Page Visited Modified Url
-------+-------------------+-------------------+-------------------------------------------------------
1 Less than 1 min ago 1479 d, 1 hr+ ago http://localhost/ (63,270 bytes)

Recent errors
Visited Reason Url
--------------------+--------------------+-------------------------------------------------------

Next Pages to be walked
Next Check Modified Url
--------------------+------------------+-------------------------------------------------------
In 59 mins 1479 d, 1 hr+ ago http://localhost/ (63,270 bytes)

Webinator Walk Report for NyTest

Creating database C:\Program Files\Thunderstone Software\Webinator/db1...Done.
Walk started at 2009-03-25 11:31:37 (by user)
JavaScript walking not enabled by current license
HTTPS walking disabled
Start fetching at http://localhost/
Ignore urls containing any of the following:
/cgi-bin/



2009-03-25 11:31:37 started 1 new (5672) on http://localhost/
Using primer: http://localhost/
1 pages fetched (63,270 bytes) from http://localhost/ took 1 seconds
0 errors
0 duplicate pages

Creating search index on fetched pages...Done.
Creating spell-checker dictionaries...Done.
2009-03-25 11:31:40 0 Extra Indexes done
Done.
Verifying usability of new walk.

Walk finished at 2009-03-25 11:31:40 (took 2 seconds)
Please contact sales at Thunderstone Software to upgrade your license to include Best Bets.

Making new database live: C:\Program Files\Thunderstone Software\Webinator/db1

--------------------------------------------------------------------------------
Checking for broken hyperlinks...
No broken hyperlinks found. Nice Job!
Checking for duplicate pages...
No duplicate pages found.
--------------------------------------------------------------------------------
End of report.

I am probably doing something so stupid that you have not thought of it yet.
Anyway, doing the same with Xtreeme search Studio indexed 17318 pages, and Microsoft Search Server 2008 Express gave 21510.
I have a few more engines to try, so if you have other tips, they are welcome.