Matthew Baron · Feb 17, 2021

Detect Hung IRIS Instance

Is there an API that can be used to remotely fetch the state of an IRIS instance, similar to what "iris list" reports? Specifically, I want to be able to remotely detect when an IRIS instance is in a "hung" state.

I know there is the "iris_system_state" metric that I can consume in Grafana, but when our system was in a "hung" state, there was simply no data for that metric. If you alert on missing data, you cannot tell whether the IRIS system is down, hung, or there is just an issue with Prometheus fetching stats from IRIS.


# iris list

Configuration 'IRIS'   (default)
directory:    /irisapp
versionid:    2020.
datadir:      /irisapp
conf file:    iris.cpf  (SuperServer port = 51773, WebServer = 52773)
status:       running, since Wed Feb 17 10:34:45 2021
state:        alert
product:      InterSystems IRISHealth

$ZV: IRIS for UNIX (Red Hat Enterprise Linux for x86-64) 2020.1 (Build 217_1U) Tue May 26 2020 20:57:10 EDT [Health:2.1.0+r1]
Product version: IRIS 2020.1


Calling into IRIS won't work if the instance is hung, so a hang can only be detected by something external to the instance. Take a look at 'iris qlist'. You can get more information from 'iris help qlist', but here are the basics:

        iris qlist
        Quick list InterSystems IRIS registry information for all instances, in a format suitable for parsing in command scripts.
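
As a rough sketch of how that output could be consumed from a script, the Python below runs 'iris qlist' on the IRIS host and looks for "hung" in the reported fields. It rests on assumptions that should be checked against 'iris help qlist' on your version: that each instance is one line of caret-delimited (^) fields, that the fourth field is the status text and the ninth is the state (the value shown as "state:" by 'iris list'), and that a hung instance reports the word "hung" there. The function names are made up for illustration. Because qlist reads the local registry, the script would have to run on the IRIS host itself (for example over SSH or from a local monitoring agent) and push its result to whatever does the alerting.

```python
import subprocess

# Assumed positions in one caret-delimited `iris qlist` line.
# Verify against `iris help qlist` for your version before relying on them.
STATUS_FIELD = 3  # e.g. "running, since Wed Feb 17 10:34:45 2021"
STATE_FIELD = 8   # e.g. "ok", "warn", "alert"; assumed to read "hung" when hung


def parse_qlist_line(line):
    """Split one qlist line into the few fields this check cares about."""
    fields = line.rstrip("\n").split("^")

    def field(index):
        # Missing trailing fields become empty strings instead of raising.
        return fields[index].strip() if index < len(fields) else ""

    return {
        "name": field(0),
        "status": field(STATUS_FIELD),
        "state": field(STATE_FIELD),
    }


def is_hung(info):
    """True if either the status or the state field mentions 'hung'."""
    return "hung" in info["state"].lower() or "hung" in info["status"].lower()


def check_local_instances(timeout=15):
    """Run `iris qlist` locally and return parsed info for every instance.

    The timeout guards against the command itself blocking; subprocess
    raises TimeoutExpired in that case, which the caller should treat
    as a failed check.
    """
    result = subprocess.run(
        ["iris", "qlist"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return [parse_qlist_line(l) for l in result.stdout.splitlines() if l.strip()]
```

A cron job or agent could call check_local_instances(), apply is_hung() to each entry, and report the result to the monitoring system. Unlike the Prometheus metric, this does not depend on IRIS answering a request, so it can still tell "hung" apart from "no data".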