Maybe we've got a cursed table, I'm not sure, but these messages appear when trying to select from it:
000 Could not create semaphore (No space left on device)
000 Unable to obtain semaphore
000 Could not open locking mechanism in the function ddopen
000 Couldn't connect to . in the function SQLConnect
It's a very small table (1 record), and the kicker is, even when we delete it and try to recreate, that message keeps coming back up. The disk we're using is only 35% full, too.
Even when we created a new directory and renamed the table, that error still came up. We're not having trouble selecting from other tables on this machine, so it appears isolated to this one we're trying to make.
Semaphores are a kernel resource. "No space" means the kernel will not allow any more semaphores to be created; it has nothing to do with memory or disk. Use "ipcs" to list semaphores and "ipcrm" to delete stale ones. You may just need to increase the number of semaphores the system allows. You need one per database.
This is from sysdef -i. Do you know if any of the semaphore parameters below would look questionable to texis' eyes? We've got.. maybe 15 databases on this Solaris box of ours.
* IPC Semaphores
*
10 semaphore identifiers (SEMMNI)
60 semaphores in system (SEMMNS)
200 undo structures in system (SEMMNU)
25 max semaphores per id (SEMMSL)
10 max operations per semop call (SEMOPM)
10 max undo entries per process (SEMUME)
32767 semaphore maximum value (SEMVMX)
16384 adjust on exit max value (SEMAEM)
Texis will use one semaphore identifier for each database, so you should make sure SEMMNI is larger than the number of databases. If SEMMNI ends up larger than SEMMNS, you would need to increase SEMMNS as well.
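As a rough illustration of the check described above, here is a small Python sketch that parses the semaphore section of "sysdef -i" output and compares SEMMNI and SEMMNS against a database count. The function names, the regular expression, and the count of 15 are assumptions made for illustration; the sample text is copied from the output posted earlier in this thread. It is not part of Texis.

```python
import re

# Sample copied from the sysdef -i output posted in this thread.
SAMPLE_SYSDEF = """\
* IPC Semaphores
*
10 semaphore identifiers (SEMMNI)
60 semaphores in system (SEMMNS)
200 undo structures in system (SEMMNU)
25 max semaphores per id (SEMMSL)
"""


def parse_sem_limits(text):
    """Return {NAME: value} for lines shaped like '<number> <description> (<NAME>)'."""
    limits = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)\s+.*\((SEM[A-Z]+)\)\s*$", line)
        if match:
            limits[match.group(2)] = int(match.group(1))
    return limits


def check_limits(limits, n_databases):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    semmni = limits.get("SEMMNI", 0)
    semmns = limits.get("SEMMNS", 0)
    # One semaphore identifier per database, so SEMMNI must exceed the count.
    if semmni <= n_databases:
        problems.append(
            "SEMMNI=%d is not larger than %d databases" % (semmni, n_databases)
        )
    # If SEMMNI is raised past SEMMNS, SEMMNS has to be raised too.
    if semmni > semmns:
        problems.append("SEMMNI=%d exceeds SEMMNS=%d" % (semmni, semmns))
    return problems


if __name__ == "__main__":
    for problem in check_limits(parse_sem_limits(SAMPLE_SYSDEF), 15):
        print(problem)
```

On a live system the text would come from running "sysdef -i" rather than from the embedded sample.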
Yeah, 10 SEMMNI < 15. Sun's defaults are rather low. Try this in your /etc/system file. A reboot is required to activate.
* Number of undo structures in system:
set semsys:seminfo_semmnu = 200
* Max number of undo entries per process:
set semsys:seminfo_semume = 200
* Max number of semaphore identifiers:
set semsys:seminfo_semmni = 200
* Max entries in semaphore map:
set semsys:seminfo_semmap = 40
Interesting explanation. I am currently creating only one database, and texis first generated three, then added a fourth, fifth, sixth, seventh and eighth semaphore as time progressed ... Once the eighth was created, it was only a matter of time before the dreaded "Could not create semaphore" message appeared, and the walk status page crashed out ...
I thought it might have to do with the threading (I originally used 4 threads), so I cut it back to two threads, but I still see the same symptoms. How do you define a "database" in the above explanation?
We will be upping our semaphore count tonight and I'll try again.
Sounds like you have permission issues such as not always running texis as the same userid. You can delete the old semaphores.
Make sure you are always running texis and its programs as the same userid, whether from the webserver's CGI or from the command line. The installation attempts to ensure this by making the programs setuid to the correct user and will warn if it can't. Check that your texis programs are setuid. Then make sure all of the texis database directories and files are owned by the correct user.
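The two checks above (setuid bit on the programs, ownership of the database files) could be scripted roughly as follows. This is only a sketch: the helper names are made up for illustration, and the paths and uid you pass in depend on your own installation.

```python
import os
import stat


def has_setuid(mode):
    """True if the setuid bit is set in a st_mode value."""
    return bool(mode & stat.S_ISUID)


def check_program(path, expected_uid):
    """Return a list of issues for one program file (empty if it looks fine)."""
    st = os.stat(path)
    issues = []
    if not has_setuid(st.st_mode):
        issues.append("%s is not setuid" % path)
    if st.st_uid != expected_uid:
        issues.append("%s is owned by uid %d, expected %d"
                      % (path, st.st_uid, expected_uid))
    return issues


def wrongly_owned(db_dir, expected_uid):
    """Return every directory and file under db_dir not owned by expected_uid."""
    wrong = []
    for root, _dirs, files in os.walk(db_dir):
        # os.walk visits each directory as root, so checking root plus its
        # files covers the whole tree.
        for path in [root] + [os.path.join(root, name) for name in files]:
            if os.stat(path).st_uid != expected_uid:
                wrong.append(path)
    return wrong
```

For example, check_program could be called on each of the installed programs and wrongly_owned on each database directory, with the uid of the user Texis is meant to run as.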
Also look in INSTALLDIR/texis/monitor.log and INSTALLDIR/texis/vortex.log to check for errors.
Thanks for the fast reply Mark. I did check the logs - no other reported problems. I looked at the +S bit - all set on the installed programs (monitor, gw, anytotx)
I'm trying with the larger semaphore count right now ...
That will only hide the problem for a little longer. It really won't use more than one semaphore per database if everything's running correctly. If it's losing them it will probably continue to lose them.
Try to spot what action causes a new semaphore to be created. Are all of the semaphores owned by the same user? What version of Texis (texis -version) and Webinator (displayed on the walk settings page) are you using?
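To answer the ownership question quickly, the output of "ipcs -s" can be grouped by owner. The sketch below assumes a Solaris-style column layout (T, ID, KEY, MODE, OWNER, GROUP); other systems print different columns, so the field positions would need adjusting. The sample rows, keys, and user names are invented for illustration and are not from the poster's machine.

```python
from collections import defaultdict

# Hypothetical sample; the layout is assumed to match Solaris "ipcs -s".
SAMPLE_IPCS = """\
T         ID      KEY        MODE        OWNER    GROUP
Semaphores:
s          0   0x0000dead --ra-------    texis    texis
s          1   0x0000beef --ra-------   nobody   nobody
s          2   0x0000cafe --ra-------    texis    texis
"""


def semaphores_by_owner(ipcs_text):
    """Map each owner name to the list of semaphore IDs it owns."""
    owners = defaultdict(list)
    for line in ipcs_text.splitlines():
        fields = line.split()
        # Semaphore rows start with the type letter "s" in the assumed layout.
        if len(fields) >= 6 and fields[0] == "s":
            sem_id, owner = fields[1], fields[4]
            owners[owner].append(sem_id)
    return dict(owners)


if __name__ == "__main__":
    for owner, ids in semaphores_by_owner(SAMPLE_IPCS).items():
        print(owner, ids)
```

More than one owner in the result would point toward the userid problem described above.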
Were the additional semaphores created while the walk was going on? Each database should only require one semaphore. If the database, and in particular the SYSLOCKS file, is removed without using <rmlocks> the semaphore may get stranded.
A new semaphore is created if the old one no longer seems to exist. Possible causes are accessing the same database over a network file system from two separate machines (each machine will have its own semaphore), permission problems, or removing the SYSLOCKS file so that Texis no longer knows which semaphore to use.