Semaphores on Tru64

MiniMe
Posts: 210
Joined: Thu Mar 15, 2001 4:30 pm

Post by MiniMe »

We recently upgraded to Tru64 5.1b to fix a bug in our SAN, but now Texis appears to run out of semaphores every 48 hours or so. Have you guys had any experience like this with Tru64? It's most annoying to be woken at 4am. ;>)
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

No. Texis will keep using the same semaphore for any given database unless it disappears or is otherwise inaccessible. Make sure all of your Texis processes run as the same non-root user all of the time, and that nothing is going around deleting shared memory segments.

Check your monitor.log and vortex.log files to see if there are any complaints about semaphores or shared mem segments or anything permission related.
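
As a rough sketch of that log check, the Python below filters log lines for words related to semaphores, shared memory, or permissions. The keyword list and the `find_ipc_complaints` name are illustrative assumptions, not part of Texis, and the second sample line is made up for the demonstration.

```python
import re

# Assumed keywords; adjust them to whatever your logs actually say.
PATTERN = re.compile(
    r"semaphore|sem(un)?lock|shared mem|shm|permission", re.IGNORECASE
)


def find_ipc_complaints(lines):
    """Return the lines that mention semaphores, shared memory, or permissions."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


# Small demonstration on in-memory lines instead of a real log file.
sample = [
    "014 Nov 4 01:38:38 (76022) Invalid argument in the function semunlock\n",
    "000 Nov 4 01:40:00 (76022) made-up line about something unrelated\n",
]
for hit in find_ipc_complaints(sample):
    print(hit)
```

For real use, open monitor.log and vortex.log from wherever your installation keeps them and pass the open file objects to `find_ipc_complaints`.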
MiniMe
Posts: 210
Joined: Thu Mar 15, 2001 4:30 pm

Post by MiniMe »

There are these errors:

200 Nov 4 01:38:38 (76022) Removed semaphore (pid=0) in the function Clear semaphore
014 Nov 4 01:38:38 (76022) Invalid argument in the function semunlock
200 Nov 4 01:39:08 (76022) Removed semaphore (pid=0) in the function Clear semaphore
000 Nov 4 01:42:00 (76022) Unable to remove semaphore: Invalid argument in the function Clear semaphore
200 Nov 4 01:44:23 (76022) Removed semaphore (pid=0) in the function Clear semaphore
014 Nov 4 01:44:23 (76022) Invalid argument in the function semunlock
200 Nov 4 01:47:40 (76022) Removed semaphore (pid=0) in the function Clear semaphore
000 Nov 4 01:48:25 (76022) Unable to remove semaphore: Invalid argument in the function Clear semaphore
200 Nov 4 01:48:46 (76022) Removed semaphore (pid=0) in the function Clear semaphore
000 Nov 4 01:51:16 (76022) Unable to remove semaphore: Invalid argument in the function Clear semaphore
200 Nov 4 01:51:47 (76022) Removed semaphore (pid=0) in the function Clear semaphore
000 Nov 4 01:55:06 (76022) Unable to remove semaphore: Invalid argument in the function Clear semaphore
000 Nov 4 01:55:16 (76022) Unable to remove semaphore: Invalid argument in the function Clear semaphore
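
To see how often each complaint recurs, a log in the layout shown above can be tallied with a short script. This is only a sketch: the regular expression is inferred from these sample lines rather than from any documented Texis log format, and `tally` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed layout: CODE MONTH DAY HH:MM:SS (PID) MESSAGE in the function NAME
LINE = re.compile(
    r"^(?P<code>\d{3}) (?P<stamp>\w{3} +\d+ \d\d:\d\d:\d\d) \((?P<pid>\d+)\) "
    r"(?P<message>.*?) in the function (?P<function>.+)$"
)


def tally(lines):
    """Count log entries by (code, message, function); skip unparsable lines."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            counts[(match["code"], match["message"], match["function"])] += 1
    return counts


# A few of the lines quoted above, used as sample input.
sample = [
    "200 Nov 4 01:38:38 (76022) Removed semaphore (pid=0) in the function Clear semaphore",
    "014 Nov 4 01:38:38 (76022) Invalid argument in the function semunlock",
    "200 Nov 4 01:39:08 (76022) Removed semaphore (pid=0) in the function Clear semaphore",
    "000 Nov 4 01:42:00 (76022) Unable to remove semaphore: Invalid argument in the function Clear semaphore",
]
for (code, message, function), count in tally(sample).most_common():
    print(count, code, function, message)
```

Grouping by function name as well as message keeps the semunlock failures separate from the "Clear semaphore" ones, which may help when comparing against timestamps in other logs.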
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Offhand, it looks like your shared memory is messed up somehow. Check permissions. Try "rmlocks -f" once on the database in question when it's not in use.
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

The other possible cause, seeing as you are using a SAN, would be two different servers trying to access the same database. The semaphore is only valid on one or the other of the servers, so each tries to create its own.
John Turnbull
Thunderstone Software
MiniMe
Posts: 210
Joined: Thu Mar 15, 2001 4:30 pm

Post by MiniMe »

Well we have an HSG80 SAN but there is only one head unit attached to it. I will try the rmlocks -f when I get a chance. Have you guys run 5.1b in house yet? Could it be an OS conflict of some sort?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Don't have 5.1b.

If all was working across reboots before the OS update, I would suspect that a new OS bug was introduced or that something about the kernel settings has changed.

If you don't know whether it worked across reboots before the update (since Unix machines tend not to be rebooted much), triple-check your permissions and which user Texis might run as under every circumstance (rc scripts, CGI, cron, etc.). Make sure monitor and related programs are setuid to the DBA to help ensure that they always run as the correct user.
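
One way to confirm the setuid part is to look at each binary's mode bits and owner. The sketch below uses only Python's standard `os` and `stat` modules; `is_setuid` and `setuid_report` are hypothetical helper names, and the path you pass in is whatever your own install uses.

```python
import os
import stat


def is_setuid(mode):
    """True if the set-user-ID bit is set in a st_mode value."""
    return bool(mode & stat.S_ISUID)


def setuid_report(path):
    """Return the owner UID and setuid status for one file."""
    info = os.stat(path)
    return {
        "path": path,
        "owner_uid": info.st_uid,
        "setuid": is_setuid(info.st_mode),
    }


# Demonstration on the current directory; substitute the monitor binary's path.
print(setuid_report("."))
```

A binary that is setuid but owned by the wrong account would still run as the wrong user, so both fields matter.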
MiniMe
Posts: 210
Joined: Thu Mar 15, 2001 4:30 pm

Post by MiniMe »

I tried rmlocks -f and got this error:

monitor: failed to start, exit code 9 (Texis Monitor already running)

I checked permissions on everything and all looks good.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Sounds like a permissions problem. It can't read the shared memory segment that tells it the monitor is already running, so it tries to start one automatically. That monitor fails because the shared memory segment already exists.

Check ownership of the shared memory segments with your system's "ipcs" command. Make sure monitor and all Texis programs that are currently running (use the "ps" command) and any that might run (INSTALLDIR/bin/*, CGIDIR/texis, any API programs you've created) always run as the same Unix user ID.
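
The "same user everywhere" check can be sketched in a few lines of Python. It assumes input with one "USER COMMAND" pair per line, as the POSIX form `ps -e -o user= -o comm=` produces; whether your system's ps accepts those options is an assumption to verify. The helper names and the sample processes are made up for illustration.

```python
import os
from collections import defaultdict


def owners_by_command(ps_text, names):
    """Map each watched command name to the set of users running it."""
    owners = defaultdict(set)
    for line in ps_text.splitlines():
        fields = line.split(None, 1)
        if len(fields) != 2:
            continue  # blank or malformed line
        user, command = fields
        name = os.path.basename(command.split()[0])
        if name in names:
            owners[name].add(user)
    return dict(owners)


def mixed(owners):
    """Keep only the commands that run under more than one user."""
    return {name: users for name, users in owners.items() if len(users) > 1}


# Made-up sample input, not taken from a real system.
sample = "database monitor\nroot texis\ndatabase texis\ndatabase sh\n"
print(mixed(owners_by_command(sample, {"monitor", "texis"})))
```

Anything that `mixed` reports is a candidate for the permission mismatch described above.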
MiniMe
Posts: 210
Joined: Thu Mar 15, 2001 4:30 pm

Post by MiniMe »

Here is what ps currently shows:

database 46985 0.0 0.0 2.32M 232K pts/0 S + 08:42:12 0:00.00 grep monitor
database 44951 0.0 0.0 9.82M 624K ?? S 08:30:00 0:00.02 monitor -d /db/tables/maint/ -z
database 8788 0.0 0.0 9.93M 720K ?? S 01:42:45 0:00.95 monitor -d /db/tables/prod/ -z
database 432 0.0 0.0 9.82M 608K ?? S 23:09:33 0:00.31 monitor -d /usr/local/morph3/texis/testdb/ -z

and ipcs shows this:

Semaphores:
T  ID    KEY         MODE         OWNER     GROUP     CREATOR   CGROUP    NSEMS  OTIME     CTIME
s  0     0x696e6974  --ra-r--r--  root      system    root      system    8      23:09:02  23:09:02
s  1     0x41e5030f  --ra-------  root      system    root      system    1      23:09:01  23:09:01
s  2     0           --ra-ra-ra-  database  system    database  system    1      8:43:06   23:09:33
s  1027  0           --ra-ra-ra-  database  database  database  database  1      no-entry  1:30:11
s  4     0x79e525ad  --ra-ra-ra-  database  database  database  database  1      8:43:10   23:19:04
s  2053  0           --ra-ra-ra-  database  database  database  database  1      no-entry  1:31:41
s  6     0           --ra-ra-ra-  database  database  database  database  1      no-entry  1:31:47
s  7     0           --ra-ra-ra-  database  database  database  database  1      no-entry  1:31:53
s  8     0           --ra-ra-ra-  database  database  database  database  1      no-entry  1:31:59
s  9     0           --ra-ra-ra-  database  database  database  database  1      no-entry  1:32:05
s  10    0           --ra-ra-ra-  database  database  database  database  1      no-entry  1:32:11
s  11    0           --ra-ra-ra-  database  database  database  database  1      no-entry  1:32:17
s  12    0           --ra-ra-ra-  database  database  database  database  1      no-entry  1:32:23
s  13    0           --ra-ra-ra-  database  database  database  database  1      no-entry  1:32:29
s  14    0           --ra-ra-ra-  database  database  database  database  1      no-entry  1:32:35
s  15    0           --ra-ra-ra-  root      daemon    root      daemon    1      8:43:15   1:32:42
s  4112  0           --ra-ra-ra-  database  daemon    database  daemon    1      8:43:15   8:30:00
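
A listing like this is easier to read once it is grouped. The sketch below assumes the eleven-column layout shown above (other ipcs implementations print different columns), and `parse_semaphores` and `never_used` are hypothetical helper names. It counts semaphore sets per owner and picks out those whose OTIME is no-entry, meaning no semaphore operation has been recorded on them since they were created.

```python
from collections import Counter

COLUMNS = ["T", "ID", "KEY", "MODE", "OWNER", "GROUP",
           "CREATOR", "CGROUP", "NSEMS", "OTIME", "CTIME"]


def parse_semaphores(ipcs_text):
    """Turn semaphore rows (lines starting with 's') into dictionaries."""
    rows = []
    for line in ipcs_text.splitlines():
        fields = line.split()
        if len(fields) == len(COLUMNS) and fields[0] == "s":
            rows.append(dict(zip(COLUMNS, fields)))
    return rows


def never_used(rows):
    """Rows whose OTIME column reads no-entry."""
    return [row for row in rows if row["OTIME"] == "no-entry"]


# Three rows copied from the listing above, plus its header line.
sample = """\
T ID KEY MODE OWNER GROUP CREATOR CGROUP NSEMS OTIME CTIME
s 1 0x41e5030f --ra------- root system root system 1 23:09:01 23:09:01
s 1027 0 --ra-ra-ra- database database database database 1 no-entry 1:30:11
s 4 0x79e525ad --ra-ra-ra- database database database database 1 8:43:10 23:19:04
"""
rows = parse_semaphores(sample)
print(Counter(row["OWNER"] for row in rows))
print([row["ID"] for row in never_used(rows)])
```

Running it on successive ipcs snapshots would show whether the number of never-used sets keeps growing between the 48-hour failures.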