I already have lock

jay.upchurch
Posts: 18
Joined: Thu Jan 10, 2002 11:12 am

Post by jay.upchurch »

Since last Thursday we are seeing an abnormal number of messages in the monitor.log and vortex.log logs. In the monitor.log file, I am seeing random groups of entries such as:

100 Jun 16 01:38:20 I already have lock
100 Jun 16 01:39:20 I already have lock
100 Jun 16 01:40:20 I already have lock
200 Jun 16 01:40:50 Database Monitor on /web/international/webinator-4.0/texis/USEnglish/db1/ exiting (pid 9728)
100 Jun 16 01:41:30 I already have lock
000 Jun 16 01:42:20 Unable to remove semaphore: Not owner in the function Clear semaphore
100 Jun 16 01:43:30 I already have lock
000 Jun 16 01:44:20 Unable to remove semaphore: Not owner in the function Clear semaphore
100 Jun 16 01:45:20 I already have lock
000 Jun 16 01:46:20 Unable to remove semaphore: Not owner in the function Clear semaphore
100 Jun 16 01:48:20 I already have lock
200 Jun 16 01:48:34 Database Monitor on /web/international/webinator-4.0/texis/USEnglish/db1/ starting (pid 11014)
000 Jun 16 01:49:30 Unable to remove semaphore: Not owner in the function Clear semaphore
000 Jun 16 01:50:20 Unable to remove semaphore: Not owner in the function Clear semaphore
200 Jun 16 01:51:34 Database Monitor on /web/international/webinator-4.0/texis/USEnglish/db1/ exiting (pid 11014)
100 Jun 16 01:52:01 Broke semaphore [5/1] in the function monlock
000 Jun 16 01:53:27 Unable to remove semaphore: Not owner in the function Clear semaphore
100 Jun 16 01:54:27 I already have lock
100 Jun 16 01:55:27 I already have lock
100 Jun 16 01:56:27 I already have lock
100 Jun 16 01:57:17 Broke semaphore [5/1] in the function monlock
000 Jun 16 01:57:37 Unable to remove semaphore: Not owner in the function Clear semaphore
100 Jun 16 01:58:27 I already have lock
000 Jun 16 01:59:27 Unable to remove semaphore: Not owner in the function Clear semaphore
000 Jun 16 02:00:58 Unable to remove semaphore: Not owner in the function Clear semaphore
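
To see which messages dominate, entries like those above can be tallied by message text. The following is only a rough sketch: it assumes every line follows the layout seen in the excerpt (a three-digit code, month, day, time, then the message), and the function name `tally_messages` is made up for illustration, not part of Texis or Webinator.

```python
import re
from collections import Counter

# Assumed layout, inferred from the quoted entries only:
# "<3-digit code> <Mon> <day> <HH:MM:SS> <message>"
LINE_RE = re.compile(r"^(\d{3}) (\w{3}) +(\d{1,2}) (\d{2}:\d{2}:\d{2}) (.*)$")


def tally_messages(lines):
    """Count log entries by (code, message); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        code, message = match.group(1), match.group(5)
        # Collapse process ids so monitor start/exit lines group together.
        message = re.sub(r"\(pid \d+\)", "(pid N)", message)
        counts[(code, message)] += 1
    return counts


# A few lines copied from the excerpt above.
sample = [
    "100 Jun 16 01:38:20 I already have lock",
    "000 Jun 16 01:42:20 Unable to remove semaphore: Not owner in the function Clear semaphore",
    "100 Jun 16 01:43:30 I already have lock",
    "100 Jun 16 01:52:01 Broke semaphore [5/1] in the function monlock",
]

for (code, message), count in tally_messages(sample).most_common():
    print(count, code, message)
```

Running the same function over the whole of monitor.log would show whether the "Not owner" entries started at a particular time.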

In the vortex.log file for the corresponding time I am seeing the following entries:

100 Jun 15 19:05:13 /webinator/mbnaUSEnglishSearch: I already have lock
000 Jun 16 01:52:50 /webinator/mbnaUSEnglishSearch:48: Timeout
000 Jun 16 08:53:38 /webinator/mbnaUSEnglishSearch:48: Timeout
100 Jun 16 15:38:13 /webinator/mbnaUSEnglishSearch: I already have lock

I'm concerned about both the semaphore messages and the timeout error. I believe the timeout error threw an error to the web site user.

One last thing, I checked the file permissions for all files installed with Webinator. They appear to all be owned by the webserver owner.

Any suggestions?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Is the monitor running as the same user?

How about the owner of the semaphore(s)? Do you have any texis jobs running as another user? Perhaps maintenance tasks outside of the webserver setting.
jay.upchurch
Posts: 18
Joined: Thu Jan 10, 2002 11:12 am

Post by jay.upchurch »

Yes, everything should be executing and running as a process under the same non-root user. The installation is under the non-root user's home directory area. However, the license.key file's group is currently "javagroup (613)", which is not the same as the non-root user's group. Is this right?

I'm not sure what a semaphore is in terms of the Webinator process. It can't be a Solaris kernel memory thing or I'd be seeing errors in the system logs, right? No other texis processes are executing under other user accounts. No maintenance runs outside of the non-root user. The night walks occur under the same non-root user.
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

The group might be different. However, the fact that the error message says "Not owner" means that texis has run as different users at some point in time. The semaphores are Solaris kernel objects. The simplest way to avoid permissions problems is to make sure that the executables are setuid to the non-root user, so that even if they are executed from a program running as a different user, it will not cause problems.
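
As a rough illustration of the setuid check described above, the sketch below reports files that are either not owned by the expected user or lack the setuid bit. It uses only the standard `os` and `stat` modules; the function name `check_setuid` is made up for this example, and which executables to pass in depends on the local installation.

```python
import os
import stat


def check_setuid(paths, expected_uid):
    """Return (path, problem) pairs for files that are not setuid to expected_uid."""
    problems = []
    for path in paths:
        info = os.stat(path)
        if info.st_uid != expected_uid:
            problems.append(
                (path, f"owned by uid {info.st_uid}, expected {expected_uid}")
            )
        elif not info.st_mode & stat.S_ISUID:
            problems.append((path, "setuid bit not set"))
    return problems


if __name__ == "__main__":
    import sys

    # Paths come from the command line; the expected owner is assumed to be
    # the user running this script.
    for path, problem in check_setuid(sys.argv[1:], os.getuid()):
        print(f"{path}: {problem}")
```

Run it as the non-root Texis user and pass the executables on the command line; anything it prints is a candidate for the "Not owner" errors.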
John Turnbull
Thunderstone Software
sfishback
Posts: 2
Joined: Thu Jun 20, 2002 8:05 am

Post by sfishback »

John,
There seems to be contention for a semaphore
or some sort of database lock between the testdb and mbnaUSenglish databases.
Can you point me to a UNIX admin guide for administering Webinator?

The UNIX admin questions I have for Webinator are:
1) Which processes should be running, and should they be running all the time?
I'm seeing two permanent processes: cgi-bin/texis and "monitor -d ....testdb/ -z".
A third process, named mbnaUSenglish, comes and goes as clients perform searches.

2) What is the semaphore that's having problems?
I'm not seeing anything in the Solaris OS system logs.
100 Jun 20 07:58:08 Broke semaphore [5/1] in the function monlock
000 Jun 20 07:58:28 Unable to remove semaphore: Not owner in the function Clear semaphore

3) FYI - we have two duplicate instances of Webinator running on two separate but identical Sun Solaris servers. One runs fine with no errors; this one has the locking/semaphore errors.
jay.upchurch
Posts: 18
Joined: Thu Jan 10, 2002 11:12 am

Post by jay.upchurch »

This morning we discovered the "ltest" command, which shows the current state of locks on the database. We ran this command on both Solaris servers. On the first server we see:

Texis Version 03.01.1423 Semaphore: 5(1) Available Locks 2000
Number of servers: 3 Database /web/international/webinator-4.0/texis/testdb/SYSLOCKS

SERID PID
0 29042
1 25403
2 [ltest]

Table Name Table Name
SYSTABLES(44) SYSINDEX(43)
SYSUSERS(43) SYSPERMS(43)
SYSTRIG(43) SYSSCHEDULE(7)
SYSSTATISTICS(820) options(19)

This is the problematic instance. After I ran the command, it quickly started rolling semaphore error messages, and I had to kill my SSH session to break out.

On the second Solaris server we see:

Texis Version 03.01.1423 Semaphore: 9(1) Available Locks 2000
Number of servers: 3 Database /web/international/webinator-4.0/texis/testdb/SYSLOCKS

SERID PID
0 985
1 938
2 [ltest]

Table Name Table Name
SYSTABLES(7) SYSINDEX(7)
SYSUSERS(7) SYSPERMS(7)
SYSTRIG(7) SYSSTATISTICS(1079)
SYSSCHEDULE(3)

This is the stable instance of Webinator. When we run this command, we do not see the semaphore issue that we encountered on the first box.

Does this shed any light on the issue?
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

Not really; they both look similar, and correct. Are you still seeing the "Not owner" message? That indicates that at least one of your executables is not setuid to the Texis user.
John Turnbull
Thunderstone Software
sfishback
Posts: 2
Joined: Thu Jun 20, 2002 8:05 am

Post by sfishback »

John, Can you answer my question from thread #6?

Also, how are they similar and correct? They look completely different to me. One says "Semaphore: 9(1)" and the other, the one we have problems with, says "Semaphore: 5(1)".
A short time after we ran the ltest command, it spun off a non-stop rolling "unable to obtain semaphore" message.

To answer your question: yes, we are seeing the "Not owner" messages. Everything is owned by the same user as the texis process. This really doesn't add up.
Here are a couple of snippets:

000 Jun 20 14:04:28 Unable to remove semaphore: Not owner in the function Clear semaphore
100 Jun 20 14:05:45 I already have lock
000 Jun 20 14:05:45 Could not create locks for SYSMETAINDEX in the function opendbtb

Help us out here John. Are there any tools or other people that we can escalate this to?
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

If you use the ipcs command then you should see the semaphores: the OS has given one semaphore ID 5 and the other semaphore ID 9. If you are seeing "Not owner" messages then you are running Texis processes as a different user.
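
For illustration only, the owner check on ipcs output could be sketched as below. The parser assumes the Solaris-style `ipcs -s` listing, where each semaphore row starts with `s` followed by ID, KEY, MODE, OWNER and GROUP columns; other systems print different columns. The sample text and the user names `webuser` and `otheruser` are hypothetical and not taken from the servers in this thread.

```python
def semaphore_owners(ipcs_text):
    """Map semaphore ID to owner, assuming rows of: s ID KEY MODE OWNER GROUP."""
    owners = {}
    for line in ipcs_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0] == "s" and fields[1].isdigit():
            owners[int(fields[1])] = fields[4]
    return owners


def foreign_semaphores(ipcs_text, expected_owner):
    """Return the semaphores whose owner differs from expected_owner."""
    return {
        sem_id: owner
        for sem_id, owner in semaphore_owners(ipcs_text).items()
        if owner != expected_owner
    }


# Hypothetical listing in the assumed layout, not real output.
sample_ipcs = """\
T         ID      KEY        MODE        OWNER    GROUP
Semaphores:
s          5   0x00000001 --ra-ra-ra- otheruser    other
s          9   0x00000002 --ra-ra-ra-   webuser    other
"""

print(foreign_semaphores(sample_ipcs, "webuser"))
```

A semaphore reported here as owned by someone other than the Texis user would be consistent with the "Not owner" errors.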

You can contact tech support directly rather than use the message boards.
John Turnbull
Thunderstone Software