Semaphores on Tru64

Posted: Tue Nov 04, 2003 11:57 am
by MiniMe
We recently upgraded to Tru64 5.1b to fix a bug in our SAN, but now Texis appears to run out of semaphores every 48 hours or so. Have you guys had any experiences like this with Tru64? It's most annoying to be woken at 4am. ;>)

Posted: Tue Nov 04, 2003 1:18 pm
by mark
No. Texis will keep using the same semaphore for any given database unless it disappears or is otherwise inaccessible. Make sure all of your Texis processes run as the same non-root user at all times, and that nothing is going around deleting shared memory segments.

Check your monitor.log and vortex.log files to see if there are any complaints about semaphores or shared mem segments or anything permission related.

Posted: Tue Nov 04, 2003 1:25 pm
by MiniMe
There are these errors:

200 Nov 4 01:38:38 (76022) Removed semaphore (pid=0) in the function Clear semaphore
014 Nov 4 01:38:38 (76022) Invalid argument in the function semunlock
200 Nov 4 01:39:08 (76022) Removed semaphore (pid=0) in the function Clear semaphore
000 Nov 4 01:42:00 (76022) Unable to remove semaphore: Invalid argument in the function Clear semaphore
200 Nov 4 01:44:23 (76022) Removed semaphore (pid=0) in the function Clear semaphore
014 Nov 4 01:44:23 (76022) Invalid argument in the function semunlock
200 Nov 4 01:47:40 (76022) Removed semaphore (pid=0) in the function Clear semaphore
000 Nov 4 01:48:25 (76022) Unable to remove semaphore: Invalid argument in the function Clear semaphore
200 Nov 4 01:48:46 (76022) Removed semaphore (pid=0) in the function Clear semaphore
000 Nov 4 01:51:16 (76022) Unable to remove semaphore: Invalid argument in the function Clear semaphore
200 Nov 4 01:51:47 (76022) Removed semaphore (pid=0) in the function Clear semaphore
000 Nov 4 01:55:06 (76022) Unable to remove semaphore: Invalid argument in the function Clear semaphore
000 Nov 4 01:55:16 (76022) Unable to remove semaphore: Invalid argument in the function Clear semaphore
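
To see which messages dominate a log like the one above, the lines can be tallied by code and message. This is only a minimal Python sketch: the field layout (three-digit code, date, time, pid in parentheses, message, then "in the function" and a name) is inferred from the lines quoted here rather than from any Texis documentation, and the names parse_line and tally are made up for illustration.

```python
import re
from collections import Counter

# Layout ASSUMED from the quoted log lines, not from Texis documentation:
# <code> <month> <day> <time> (<pid>) <message> in the function <name>
LINE_RE = re.compile(
    r"^(?P<code>\d{3})\s+(?P<month>[A-Za-z]{3})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+\((?P<pid>\d+)\)\s+"
    r"(?P<message>.*)\s+in the function\s+(?P<function>.+)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def tally(lines):
    """Count log entries per (code, message) pair, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            counts[(entry["code"], entry["message"])] += 1
    return counts


# A few lines copied from the log posted above.
SAMPLE = [
    "200 Nov 4 01:38:38 (76022) Removed semaphore (pid=0) in the function Clear semaphore",
    "014 Nov 4 01:38:38 (76022) Invalid argument in the function semunlock",
    "200 Nov 4 01:39:08 (76022) Removed semaphore (pid=0) in the function Clear semaphore",
    "000 Nov 4 01:42:00 (76022) Unable to remove semaphore: Invalid argument in the function Clear semaphore",
]

if __name__ == "__main__":
    for (code, message), count in tally(SAMPLE).most_common():
        print(code, count, message)
```

Feeding it the whole log file instead of SAMPLE would show how often the "Removed semaphore" and "Unable to remove semaphore" messages occur over time.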

Posted: Tue Nov 04, 2003 1:47 pm
by mark
Offhand, it looks like your shared memory is messed up somehow. Check the permissions, then try running "rmlocks -f" once on the database in question while it's not in use.

Posted: Tue Nov 04, 2003 2:01 pm
by John
The other possible cause of that, seeing as you are using a SAN, would be two different servers trying to access the same database. The semaphore is only valid on one or the other of the servers, so each tries to create its own.

Posted: Tue Nov 04, 2003 2:41 pm
by MiniMe
Well, we have an HSG80 SAN, but there is only one head unit attached to it. I will try rmlocks -f when I get a chance. Have you guys run 5.1b in house yet? Could it be an OS conflict of some sort?

Posted: Tue Nov 04, 2003 3:09 pm
by mark
Don't have 5.1b.

If everything was working across reboots before the OS update, I would suspect that either a new OS bug was introduced or something about the kernel settings has changed.

If you don't know whether it worked across reboots before the update (since Unix machines tend not to be rebooted much), triple check your permissions and what user Texis might run as under every circumstance (rc scripts, cgi, cron, etc). Make sure monitor and related programs are setuid to the DBA to help ensure that they always run as the correct user.
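
One way to double check the setuid part is to inspect the owner and mode bits of each program. The following is a rough Python sketch, not a Texis tool: the function name setuid_report is made up, and the directory and uid in the usage comment are placeholders for whatever your install directory and DBA account actually are.

```python
import os
import stat


def setuid_report(path, expected_uid):
    """Report who owns `path` and whether its setuid bit is set."""
    st = os.stat(path)
    return {
        "path": path,
        "owner_uid": st.st_uid,
        # True when the file belongs to the account we expect (the DBA).
        "owner_ok": st.st_uid == expected_uid,
        # True when the setuid bit is set, so the program runs as its owner.
        "setuid": bool(st.st_mode & stat.S_ISUID),
    }


# Hypothetical usage: both values below are placeholders, not real settings.
#   bin_dir = "/path/to/INSTALLDIR/bin"
#   dba_uid = 1234
#   for name in sorted(os.listdir(bin_dir)):
#       print(setuid_report(os.path.join(bin_dir, name), dba_uid))
```

Any program reported with owner_ok or setuid false is a candidate for running as the wrong user when started from cron, an rc script, or the web server.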

Posted: Wed Nov 05, 2003 10:21 am
by MiniMe
I tried rmlocks -f and got this error:

monitor: failed to start, exit code 9 (Texis Monitor already running)

I checked permissions on everything and all looks good.

Posted: Wed Nov 05, 2003 11:38 am
by mark
Sounds like a permissions problem. rmlocks can't read the shared memory segment that tells it the monitor is already running, so it tries to start one automatically. That new monitor then fails because the shared memory segment already exists.

Check ownership of the shared memory segments with your system's "ipcs" command. Make sure monitor and all Texis programs that are currently running (use the "ps" command) and any that might run (INSTALLDIR/bin/*, CGIDIR/texis, any API programs you've created) always run as the same Unix userid.

Posted: Wed Nov 05, 2003 11:43 am
by MiniMe
Here is what ps currently shows:

database 46985 0.0 0.0 2.32M 232K pts/0 S + 08:42:12 0:00.00 grep monitor
database 44951 0.0 0.0 9.82M 624K ?? S 08:30:00 0:00.02 monitor -d /db/tables/maint/ -z
database 8788 0.0 0.0 9.93M 720K ?? S 01:42:45 0:00.95 monitor -d /db/tables/prod/ -z
database 432 0.0 0.0 9.82M 608K ?? S 23:09:33 0:00.31 monitor -d /usr/local/morph3/texis/testdb/ -z

And ipcs shows this:

Semaphores:
T ID   KEY        MODE        OWNER    GROUP    CREATOR  CGROUP   NSEMS OTIME    CTIME
s 0    0x696e6974 --ra-r--r-- root     system   root     system   8     23:09:02 23:09:02
s 1    0x41e5030f --ra------- root     system   root     system   1     23:09:01 23:09:01
s 2    0          --ra-ra-ra- database system   database system   1     8:43:06  23:09:33
s 1027 0          --ra-ra-ra- database database database database 1     no-entry 1:30:11
s 4    0x79e525ad --ra-ra-ra- database database database database 1     8:43:10  23:19:04
s 2053 0          --ra-ra-ra- database database database database 1     no-entry 1:31:41
s 6    0          --ra-ra-ra- database database database database 1     no-entry 1:31:47
s 7    0          --ra-ra-ra- database database database database 1     no-entry 1:31:53
s 8    0          --ra-ra-ra- database database database database 1     no-entry 1:31:59
s 9    0          --ra-ra-ra- database database database database 1     no-entry 1:32:05
s 10   0          --ra-ra-ra- database database database database 1     no-entry 1:32:11
s 11   0          --ra-ra-ra- database database database database 1     no-entry 1:32:17
s 12   0          --ra-ra-ra- database database database database 1     no-entry 1:32:23
s 13   0          --ra-ra-ra- database database database database 1     no-entry 1:32:29
s 14   0          --ra-ra-ra- database database database database 1     no-entry 1:32:35
s 15   0          --ra-ra-ra- root     daemon   root     daemon   1     8:43:15  1:32:42
s 4112 0          --ra-ra-ra- database daemon   database daemon   1     8:43:15  8:30:00
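
A listing like this is easier to read once it is grouped by owner. Below is a small Python sketch that does that and also picks out the semaphores whose OTIME is "no-entry", which presumably means no operation has been performed on them since they were created. The eleven-column layout is taken from the header line above and may differ on other systems; the function names are made up for illustration.

```python
from collections import Counter

# Column names ASSUMED from the header line of the ipcs output above.
COLUMNS = ["T", "ID", "KEY", "MODE", "OWNER", "GROUP",
           "CREATOR", "CGROUP", "NSEMS", "OTIME", "CTIME"]


def parse_semaphores(text):
    """Turn the semaphore rows (lines starting with 's') into dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and anything with a different shape.
        if len(fields) == len(COLUMNS) and fields[0] == "s":
            rows.append(dict(zip(COLUMNS, fields)))
    return rows


def summarize(rows):
    """Return (semaphore count per owner, IDs whose OTIME is 'no-entry')."""
    by_owner = Counter(row["OWNER"] for row in rows)
    never_used = [row["ID"] for row in rows if row["OTIME"] == "no-entry"]
    return by_owner, never_used


# A few rows copied from the listing above.
SAMPLE = """\
T ID KEY MODE OWNER GROUP CREATOR CGROUP NSEMS OTIME CTIME
s 0 0x696e6974 --ra-r--r-- root system root system 8 23:09:02 23:09:02
s 2 0 --ra-ra-ra- database system database system 1 8:43:06 23:09:33
s 1027 0 --ra-ra-ra- database database database database 1 no-entry 1:30:11
s 15 0 --ra-ra-ra- root daemon root daemon 1 8:43:15 1:32:42
"""

if __name__ == "__main__":
    by_owner, never_used = summarize(parse_semaphores(SAMPLE))
    print(dict(by_owner))
    print(never_used)
```

Run against the full ipcs output at intervals, this would show whether the number of "no-entry" semaphores keeps growing and which user owns them.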