From: yong321 on 10 Nov 2005 16:23
Jonathan Lewis wrote:
> <yong321(a)yahoo.com> wrote in message
> > Dusan Bolek wrote:
> >> > Metalink Note:262226.1 offers some interpretation. Your process 131321
> >> > on node 1 is waiting for a TX lock in mode 5 (the two hex numbers,
> >> > 0x2b90011 and 0x5f20, may be id1 and id2 in v$lock, respectively).
> >> > Process 131317 on the same node is holding it.
> > You're right. I did some testing and found that the two numbers,
> > [131321,1285], in your case, do not in any way denote a process
> > (Note:262226.1 says the first number is PID). Instead they correspond
> > to transaction_id0 and transaction_id1 of v$ges_blocking_enqueue,
> > respectively (or the same in v$dlm_locks). The documentation says they're
> > the lower and upper 4 bytes of the transaction identifier to which the
> > lock belongs. I can't find more information about it. Perhaps for our
> > purposes, we can conceptually think of the combination of the two
> > numbers, i.e. a transaction identifier, as a process identifier.
> > By the way, I do see the SQL involved in the global deadlock (tested in
> > 188.8.131.52.0 on Linux):
> > ...
> > *** 2005-11-04 13:38:33.199
> > user session for deadlock lock 0x7553ab14
> > ...
> > Current SQL Statement:
> > update test set a = :"SYS_B_0" where a = :"SYS_B_1"
> > Global Wait-For-Graph(WFG) at ddTS[0.28] :
> > BLOCKED 0x7553ab14 5 [0xf001d][0x8353],[TX] [2162689,7995] 0
> > ...
> > In any case, follow Jonathan's practical advice.
> > Yong Huang
> Given that the documentation and notes are wrong about the
> process id / transaction id - might the comment about the
> mode 5 lock also be wrong?
> Could you see a transaction holding or requesting a TX mode 5
> in the few seconds before the deadlock was reported?
> Jonathan Lewis
> The Co-operative Oracle Users' FAQ
> Cost Based Oracle: Fundamentals
> Public Appearances - schedule updated 4th Nov 2005
I just tested again. I do not see lock mode 5. When the two sessions
involved in the deadlock were hung, the session that later detected
ORA-60 had this in v$lock:
SID TY ID1 ID2 LMODE REQUEST CTIME
--- -- ---------- ---------- ---------- ---------- ----------
133 TX 327689 60114 6 0 72
133 TM 114196 0 3 0 72
133 TX 852009 33959 0 6 60
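(For reference, waiter/holder pairs like the two TX rows above can be matched with a self-join on v$lock. This is only a sketch of the usual technique, not something from the trace; in RAC you would need gv$lock, with inst_id, to see sessions on both instances.)

```sql
-- Sketch: pair a requesting TX row with the session holding the same
-- TX resource, by matching (ID1, ID2).  Single-instance view only;
-- use GV$LOCK and INST_ID to cover both RAC nodes.
SELECT w.sid  AS waiter_sid,
       h.sid  AS holder_sid,
       w.id1, w.id2,
       w.request, h.lmode
FROM   v$lock w, v$lock h
WHERE  w.type    = 'TX'
AND    w.request > 0
AND    h.type    = 'TX'
AND    h.lmode   > 0
AND    w.id1     = h.id1
AND    w.id2     = h.id2;
```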
and these in v$ges_blocking_enqueue:
HANDLE GRANT_LEV REQUEST_L OPEN_OPT_DEADLOCK WHICH_QUEUE STATE
-------- --------- --------- ----------------- ----------- -----
68D149C0 KJUSEREX KJUSEREX 1 1 GRANTED
68D157E4 KJUSERNL KJUSEREX 1 2 OPENING
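(The two rows above can be pulled with a query along these lines; a sketch, using columns documented for v$ges_blocking_enqueue:)

```sql
-- Sketch: list GES enqueues that are blocked or blocking, with their
-- grant/request levels and the owning transaction id halves.
SELECT handle, grant_level, request_level, state,
       transaction_id0, transaction_id1
FROM   v$ges_blocking_enqueue
ORDER  BY state;
```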
After this session detected ORA-60, followed by a few seconds' delay,
the last row in the v$lock output (where LMODE = 0) was gone, and the
second row in v$ges_blocking_enqueue was gone as well. The lmd0 process
trace file shows:
Global Wait-For-Graph(WFG) at ddTS[0.5e] :
BLOCKED 0x68d157e4 5 [0xd0029][0x84a7],[TX] [2621441,14115] 0
BLOCKER 0x6d6964bc 5 [0xd0029][0x84a7],[TX] [1835010,4352] 1
BLOCKED 0x6c633798 5 [0x50009][0xead2],[TX] [1835010,4352] 1
BLOCKER 0x68d149c0 5 [0x50009][0xead2],[TX] [2621441,14115] 0
So you're right. According to v$lock, there's no mode 5 lock involved.
In fact, seeing 5 on all four lines of this Wait-For-Graph, plus the
fact that 5 is the only mode I see in every WFG I've come across, makes
me wonder whether this number is hardcoded in the Oracle code.
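(Incidentally, the hex TX resource ids in the WFG line up with the v$lock rows above: for a TX resource, id1 appears to pack the undo segment number in the upper two bytes and the slot in the lower two, with id2 being the wrap/sequence number. A quick sanity check of that assumed layout, using 0xd0029 = 852009 and 0x84a7 = 33959 from the requested TX row:)

```sql
-- Sketch: decode TX resource id1 into undo segment and slot.
-- 852009 = 0xd0029: upper two bytes 0xd = 13, lower two 0x29 = 41.
SELECT trunc(852009 / 65536) AS undo_segment,  -- 13
       mod(852009, 65536)    AS slot,          -- 41
       33959                 AS wrap#
FROM   dual;
```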
From: Dusan Bolek on 11 Nov 2005 07:30
This code definitely came from an application server. They store all
application server sessions in that table. It looks like there was a
lock on the table, so no one could log in or log out. Our application
vendor is still investigating its code; it looks like there are more
issues to be solved. However, I'm glad that I have been able to get
some meaningful information from a deadlock trace file. By taking the
transaction id from the trace file, I can find the offending SQL in
LogMiner output, and that helps a lot in solving deadlock-related
problems.
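(As a sketch of that lookup: once the transaction id from the trace is split into undo segment, slot, and wrap# — e.g. 852009/33959 from the earlier v$lock output decodes to 13, 41, 33959 — LogMiner output can be filtered on the XID columns of v$logmnr_contents. This assumes a DBMS_LOGMNR session has already been started against the relevant archived logs.)

```sql
-- Sketch: find the redo SQL of one transaction by its XID components
-- (undo segment, slot, wrap#).  Requires an active LogMiner session.
SELECT scn, timestamp, operation, sql_redo
FROM   v$logmnr_contents
WHERE  xidusn = 13
AND    xidslt = 41
AND    xidsqn = 33959;
```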
Another interesting thing I've found about deadlock tracing: if a
deadlock occurs between two transactions on the same node, there is no
SQL in the generated trace file, but if the transactions are on
different nodes, the SQL is left in the trace file on one of the
affected nodes. I don't know why, but it looks like it works this way.