Large indexes and live search

mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

The doc files were larger than the max page size and got truncated. You need to increase the max page size under all walk settings.
scott.shaver
Posts: 45
Joined: Tue May 31, 2005 12:13 pm

Post by scott.shaver »

They were identical. I had to kill the texis process for the slow walk because it wouldn't stop. I deleted the index that wasn't working and created another index as a duplicate of the slow one. I changed the max size to 10 MB and set the primer URL to None, tweaked a couple of the JavaScript settings, and upped the verbosity to 4. I tried to run the new one and got the walk status page shown below. What is going on? Why won't it walk the file server at all now?

Walk Status
Current User: webinator
Current Profile: evergreen_corp2 Webinator 5.1.29-Windows-w/plugin

Latest run:
0 pages in todo
0 pages scheduled to be refreshed in the next hour
2 pages visited in the last hour (1 success/1 failed)
1 pages in index


Pages recently walked
1 pages (0 bytes).
1 errors.
0 duplicate pages.

Page    Visited             Modified            Url
-------+-------------------+-------------------+-------------------------------------------------------
1       2 mins ago          2 mins ago          file://evergreen/corp/ (0 bytes)

Recent errors
Visited              Reason               Url
--------------------+--------------------+-------------------------------------------------------
2 mins ago           Document not found:  file://evergreen/corp/

Next Pages to be walked
Next Check           Modified           Url
--------------------+------------------+-------------------------------------------------------
In 6 d, 23 hr+       2 mins ago         file://evergreen/corp/ (0 bytes)


Walk started at 2006-03-21 12:43:35 (by resume)
Verbosity set to 4
JavaScript walking disabled
HTTPS walking disabled
Start fetching at file://evergreen/corp/
file://evergreen/corp/
Ignore urls containing any of the following:
/cgi-bin/
~
?
/private
started 1 refresh (1880) on file://evergreen/corp/
0 pages fetched (0 bytes) from file://evergreen/corp/
1 errors

Updating search index ...Done.
Creating spell-checker dictionaries...Done.
Done.
Verifying usability of new walk.

Walk finished at 2006-03-21 12:43:42 (took 3 seconds)
Keeping database live: s:\texisindexes\evergreen_corp2/db2

--------------------------------------------------------------------------------
Checking for broken hyperlinks...

The link : file://evergreen/corp/
Had this error: Document not found: file:// document from file \\evergreen\corp\: The system cannot find the path specified
--------------------------------------------------------------------------------
End of report.
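
The error in the report shows file://evergreen/corp/ being resolved to the UNC path \\evergreen\corp\. As a rough illustration of that mapping only (this is not Webinator's own code, and `file_url_to_unc` is a hypothetical helper name), a file:// URL with a host part could be converted in Python like this:

```python
from urllib.parse import urlsplit, unquote

def file_url_to_unc(url):
    """Convert file://host/share/path into \\\\host\\share\\path (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme != "file" or not parts.netloc:
        raise ValueError("expected a file:// URL with a host part: " + url)
    # Decode %xx escapes, then switch to backslashes for the UNC form.
    return "\\\\" + parts.netloc + unquote(parts.path).replace("/", "\\")

if __name__ == "__main__":
    print(file_url_to_unc("file://evergreen/corp/"))
```

Pasting the resulting path into Explorer or `dir` on the walking machine is a quick way to see whether the same "cannot find the path" error shows up outside the walker.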
scott.shaver
Posts: 45
Joined: Tue May 31, 2005 12:13 pm

Post by scott.shaver »

Well, this is interesting: I just set up an identical index on the test box and it works over there. I don't know what is going on.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

The machine where it's not working has permission to access \\evergreen\corp, right?
Does the webserver run as the same user with the same perms on both systems? Did someone go around locking down or cleaning up things on the non-functioning server? We've had reports of things working on dev machines but not production machines or vice-versa because they weren't true clones of each other.
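
One way to check the permission question is to try listing the share root from the production box while logged in as the account the walk runs under. A minimal Python sketch, assuming Python is available on that machine; `probe_directory` is a hypothetical helper name, and only the share path comes from the walk report:

```python
import os

def probe_directory(path):
    """Try to list `path`; return (True, entry_count) or (False, error_text)."""
    try:
        entries = os.listdir(path)
    except OSError as exc:
        # Covers "path not found" as well as "access denied".
        return False, f"{type(exc).__name__}: {exc}"
    return True, len(entries)

if __name__ == "__main__":
    # Share path taken from the walk report; run this as the walking account.
    share = r"\\evergreen\corp"
    ok, detail = probe_directory(share)
    print(share, "->", "OK" if ok else "FAILED", detail)
```

Running the same script on both the working and the non-working machine, under the same account, should show whether the difference is in the OS or share access rather than in the walk settings.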
scott.shaver
Posts: 45
Joined: Tue May 31, 2005 12:13 pm

Post by scott.shaver »

I'm getting there. The issue seems to be a problem with how Win2003, Win2k, Texis, and DFS interact. I'll have more info tomorrow.
scott.shaver
Posts: 45
Joined: Tue May 31, 2005 12:13 pm

Post by scott.shaver »

Okay, so the test machine is Win2k, and the \\evergreen\corp network share is a set of DFS directories. The test box works fine walking the share.

The production box is Windows 2003 Server. It will not walk the share from the root for some reason. I have to tell it to start inside a directory deeper in the tree, which is obviously not what I want.

I have no idea why the prod box worked the first time and now doesn't.

There appears to be a difference between Win2k and Win2003 when it comes to Texis indexing DFS shares.
scott.shaver
Posts: 45
Joined: Tue May 31, 2005 12:13 pm

Post by scott.shaver »

Any ideas here, guys? I really need to be able to start at the root.
Kai
Site Admin
Posts: 1272
Joined: Tue Apr 25, 2000 1:27 pm

Post by Kai »

Which version of Texis is running on the machine that has the problem (output of texis -version)?
scott.shaver
Posts: 45
Joined: Tue May 31, 2005 12:13 pm

Post by scott.shaver »

Texis Web Script (Vortex) Copyright (c) 1996-2006 Thunderstone - EPI, Inc.
Commercial Version 5.01.1137083200 20060112 (i686-intel-winnt-64-32)

Same user account from the domain is being used on both machines.

There are about 5 folders in the root which my account (don't know about the indexing account) can't access. I'm trying to find out if they are invalid or if it is just a permissions thing. Will an invalid link cause the walk to not work?
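
To narrow down which root folders are the problem, something like the following could be run as the indexing account on each machine. It's only a sketch: `audit_subfolders` is a made-up helper name, and it does not tell you whether an invalid DFS link is what stops the walk, only which entries under the root that account can't list.

```python
import os

def audit_subfolders(root):
    """Return (readable, unreadable) for the immediate subfolders of `root`.

    `unreadable` holds (name, error_text) pairs. Regular files are skipped.
    """
    readable, unreadable = [], []
    for name in sorted(os.listdir(root)):
        full = os.path.join(root, name)
        if os.path.isfile(full):
            continue  # only folders (and links to folders) matter here
        try:
            os.listdir(full)
        except OSError as exc:
            # Dead links and permission problems both end up here;
            # the error text helps tell them apart.
            unreadable.append((name, f"{type(exc).__name__}: {exc}"))
        else:
            readable.append(name)
    return readable, unreadable

if __name__ == "__main__":
    root = r"\\evergreen\corp"  # share root from the walk report
    try:
        ok_names, bad = audit_subfolders(root)
    except OSError as exc:
        print("cannot list root:", exc)
    else:
        print("readable:", ok_names)
        for name, err in bad:
            print("unreadable:", name, "->", err)
```

The error text should help separate a "path not found" style failure (likely a dead link) from an "access denied" one (permissions).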
scott.shaver
Posts: 45
Joined: Tue May 31, 2005 12:13 pm

Post by scott.shaver »

LOL, bummer