Question
· Jun 23, 2016

Looking for 'dead' locks

Can anybody share code that looks for 'dead' locks (deadlocks)?

It's not trivial to find such locks manually when there are many records in the lock table.


Thank you, Dmitry, for the response.

I only need to detect mutual dependencies between the locks held and the lock attempts made by different processes.

Deadlock sample:
Process 1: Lock +^Glb(1)
Process 2: Lock +^Glb(2)
Process 1: Lock +^Glb(2)
Process 2: Lock +^Glb(1)

Process 1 is waiting for ^Glb(2) to be unlocked, but that can never happen, because process 2 is in turn waiting for process 1 to unlock ^Glb(1).

^$LOCK contains information only about successful locks. To detect 'dead' locks we also need information about lock attempts (the %SYS.LockQuery:WebList query result that the SMP displays).
The SMP 'View Lock' mode can display thousands of records, but I need to filter out only the mutually dependent locks.

You can use the information from %SYS.LockQuery to graph the locks with their owners and waiters. Then do a depth-first traversal of each node, looking for a cycle.

Here's a sketch of building the graph:

s rs=##class(%ResultSet).%New("%SYS.LockQuery:Detail")
s status=rs.Execute()
k graph
f i=1:1 q:'rs.%Next()  d
. s ref="L"_i,graph(ref,rs.Owner)=1
. f j=1:1:$l(rs.WaiterPID," ") d
. . s pid=$p(rs.WaiterPID," ",j) s:pid]"" graph(pid,ref)=1

The graph looks something like this:

graph(3330,"L5")=1
graph(4380,"L4")=1
graph("L1",3309)=1
graph("L2",3326)=1
graph("L3",3327)=1
graph("L4",3330)=1
graph("L5",4380)=1

I've generated IDs for the locks to avoid a SUBSCRIPT error for long references. You'll want to keep a list of the original lock names.

Here's a (minimally tested) traversal method that returns an error if it finds a cycle:

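ClassMethod dfs(byref graph, node as %String, byref visited) as %Status {
    ; visited() holds the nodes on the current path; meeting one of them again means a cycle
    s status=$$$OK
    i $d(node) d
    . i $d(visited(node)) d  q
    . . s status=$$$ERROR($$$GeneralError,"found a cycle at node "_node)
    . s visited(node)=1
    . s next=""
    . f  s next=$o(graph(node,next)) q:""=next  d  q:$$$ISERR(status)
    . . s status=..dfs(.graph,next,.visited)
    . ; leave the path on the way back, so a node reachable by two routes is not reported as a cycle
    . k:$$$ISOK(status) visited(node)
    e  d
    . ; no start node given: try every top-level node of the graph as a root
    . s root=""
    . f  s root=$o(graph(root)) q:""=root  d  q:$$$ISERR(status)
    . . k visited
    . . s status=..dfs(.graph,root,.visited)
    q status
}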

If you try it on the previous graph, it will return an error like the following:

USER>s status=##class(deadlock).dfs(.graph) 

USER>d $system.OBJ.DisplayError(status)    

ERROR #5001: found a cycle at node L5
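
As a further check, the two-process sample from earlier in the thread can be turned into a graph by hand and passed to the same method. This is only a sketch: the PIDs 101 and 102 and the IDs L1 and L2 are made up for illustration, and it assumes the dfs method above has been compiled into a class named deadlock, as in the session shown.

```objectscript
 ; hand-built graph for the ^Glb sample (PIDs and lock IDs are hypothetical)
 k graph
 s graph("L1",101)=1 ; L1 stands for ^Glb(1), owned by process 101
 s graph("L2",102)=1 ; L2 stands for ^Glb(2), owned by process 102
 s graph(101,"L2")=1 ; process 101 is waiting for ^Glb(2)
 s graph(102,"L1")=1 ; process 102 is waiting for ^Glb(1)
 s status=##class(deadlock).dfs(.graph)
 d $system.OBJ.DisplayError(status)
```

Since each process waits for a lock owned by the other, the traversal should come back with a "found a cycle" error. Removing either of the two waiter entries breaks the cycle.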

Many thanks, John, for the idea. It's much easier to use the WaiterPID values.

One correction to your code:

s rs=##class(%ResultSet).%New("%SYS.LockQuery:Detail")
s status=rs.Execute()
k graph
f i=1:1 q:'rs.%Next()  d
. s ref="L"_i,graph(ref,rs.Owner)=1
. f j=1:1:$l(rs.WaiterPID,",") d
. . s pid=$p(rs.WaiterPID,",",j) s:pid]"" graph(pid,ref)=1

WaiterPID values are separated by ",", not by " ".
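
The effect of the delimiter can be checked on a plain string. In this small sketch the WaiterPID value is made up; it only shows how $LENGTH and $PIECE behave with each delimiter.

```objectscript
 ; made-up WaiterPID value with two waiting processes
 s waiters="3330,4380"
 w $l(waiters," "),!   ; 1: there is no space, so the string counts as a single piece
 w $p(waiters," ",1),! ; 3330,4380: the whole string would end up as one graph subscript
 w $l(waiters,","),!   ; 2
 w $p(waiters,",",2),! ; 4380
```

With the space delimiter, two waiters collapse into a single node named "3330,4380". That node never matches an owner PID, so a cycle running through either process would go unnoticed.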