Jul 6, 2023

Are there any known causes of IRIS entering a deadlock/hang state?

Based on your experience, do you know of any reason why IRIS would enter a deadlock/hang state?

When this occurs, it is no longer possible to connect to the Portal or Studio, even though the IRIS service (IRIS.EXE processes) is still active. CPU, memory, and network usage are usually very low (i.e., it does not occur because the server is overloaded). The only fix is a full restart of IRIS (e.g., by clicking the IRIS icon in the notification area and choosing the appropriate action).

I had this issue on a production server a few weeks ago. Every request sent to IRIS led to a timeout (and it was no longer possible to open Studio or the Portal). The only solution was to restart the IRIS service. Apache seemed fine. Inspecting the logs and running a performance report (^pButtons) over the following days did not help identify what went wrong.

I did some research and found at least two ways to reproduce similar behavior:

1) Too many locks are created (far more than the locksiz parameter allows).
This simple loop will crash the system within a few seconds (do not try it!), requiring a full restart.

// WARNING: fills the lock table; do not run on a system you care about
for i=1:1:1000000 {
    set ^A(i) = ""
    lock +^A(i)
}
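
For contrast, here is a minimal sketch of a variant that needs only one lock table entry. It relies on the fact that a lock on a parent node (^A) also covers its subscripts (^A(i)). The 5-second timeout is an arbitrary value chosen for illustration, and whether a single coarse lock is acceptable depends on how much concurrency the application needs.

```objectscript
// Take a single lock on the parent node with a 5-second timeout (illustrative value).
// $TEST is set to 1 if the timed lock was acquired, 0 otherwise.
lock +^A:5
if $test {
    for i=1:1:1000000 {
        set ^A(i) = ""
    }
    // Release the one lock once the loop is done
    lock -^A
} else {
    write "Could not lock ^A within 5 seconds",!
}
```

This does not explain the production hang, but it shows why a per-node locking loop is so much more expensive in lock table space than a lock on the parent node.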

Since locks use shared memory (sized by the gmheap parameter), is it possible that something else (e.g., string allocation) could consume a lot of shared memory, leaving very little for the locks themselves?

2) There is no space left on the disk where the journal is located.
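
A minimal sketch of how free space on the journal volume could be checked from ObjectScript, so that this condition can be spotted before it causes a hang. The directory path and the 10% threshold are assumptions for illustration only; substitute the journal directory actually configured for the instance. The call used is %File.GetDirectorySpace, with the flag value 1 understood to request sizes in MB.

```objectscript
// Hypothetical journal directory; replace with the instance's configured path
set journalDir = "C:\InterSystems\IRIS\mgr\journal\"

// Flag 1 is assumed to return sizes in megabytes (see the %File class reference)
set sc = ##class(%File).GetDirectorySpace(journalDir, .freeMB, .totalMB, 1)

if $system.Status.IsError(sc) {
    do $system.Status.DisplayError(sc)
} elseif totalMB > 0 {
    write "Journal volume free: ", freeMB, " MB of ", totalMB, " MB",!
    // 10% is an arbitrary illustrative threshold
    if (freeMB / totalMB) < 0.10 {
        write "WARNING: less than 10% free on the journal volume",!
    }
}
```

Something like this could be run periodically (for example from a scheduled task) to raise an alert well before the disk fills up.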

Do you know of any other causes that can bring the system down (with the symptoms I describe at the top of my post)?

Product version: IRIS 2021.1
$ZV: IRIS for Windows (x86-64) 2021.1 (Build 215U) Wed Jun 9 2021 09:39:22 EDT

"Deadlock" is too broad a term to cover every possibility that could cause the instance to hang. I would recommend reaching out to the WRC/support when that occurs so they can analyze the system with you.

FWIW, the first place I would look is messages.log, which should point to the next investigative steps. Alexander's IRIShung suggestion is also a good one.
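
As a rough sketch of that first step, the snippet below scans messages.log for entries with severity 2 (severe) or 3 (fatal). It assumes the log sits in the manager directory and that each entry has the shape "timestamp (pid) severity [category] text", so that the severity is the third space-delimited piece. Lines that do not follow that shape are simply skipped.

```objectscript
// Assumption: messages.log lives in the instance's manager (mgr) directory
set path = $system.Util.ManagerDirectory() _ "messages.log"

set stream = ##class(%Stream.FileCharacter).%New()
set sc = stream.LinkToFile(path)

if $system.Status.IsError(sc) {
    do $system.Status.DisplayError(sc)
} else {
    while 'stream.AtEnd {
        set line = stream.ReadLine()
        // Assumed layout: "<timestamp> (<pid>) <severity> [<category>] <text>"
        // Unary + turns a non-numeric piece into 0, so other lines are ignored
        set severity = +$piece(line, " ", 3)
        if severity >= 2 {
            write line,!
        }
    }
}
```

If the instance is already hung, this obviously cannot be run inside IRIS; in that case the same file can be opened with any text editor.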