Question
· Feb 23, 2021
Logging for "Dead" Processes

Today we had an issue where a couple of our IRIS "processes" had a status of "Dead".

How do I configure IRIS so that, when that happens, IRIS will log an event to messages.log or alerts.log?

We are forwarding the contents of both of those logs to Splunk for analysis, and I'd like to be able to see those events in there as well.

Question
· Feb 17, 2021
Detect Hung IRIS Instance

Is there an API that can be used to remotely fetch the state of an IRIS instance, similar to what "iris list" reports? Specifically, I want to be able to remotely detect when an IRIS instance is in a "hung" state.

I know there is the "iris_system_state" metric that I can consume in Grafana, but I found that when our system was in a "hung" state, there was simply no data for that metric. If you alert on missing data, you cannot tell whether the IRIS system is down, whether it is hung, or whether there is just an issue with Prometheus fetching stats from IRIS.
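
As a rough illustration of the distinction being asked about, the sketch below probes the metrics endpoint directly and reports how the request failed, rather than relying on the absence of data in Prometheus. It is only a sketch under assumptions: the URL (host, port, and the `/api/monitor/metrics` path) is a placeholder for whatever your instance exposes, the function names `parse_system_state` and `probe` are made up for this example, and treating a timeout as "possibly hung" is a guess, not documented IRIS behavior.

```python
import socket
import urllib.error
import urllib.request


def parse_system_state(metrics_text):
    """Return the iris_system_state value from Prometheus text output, or None.

    Assumes sample lines of the form "<name>[{labels}] <value>" with no
    trailing timestamp.
    """
    for line in metrics_text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip HELP/TYPE comments and blank lines
        name = line.split("{")[0].split(" ")[0]
        if name == "iris_system_state":
            try:
                return float(line.rsplit(" ", 1)[1])
            except (IndexError, ValueError):
                return None
    return None


def probe(url, timeout=5.0):
    """Classify one probe attempt as a short status string."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        return "http_error_%d" % exc.code  # web server answered, but not with metrics
    except urllib.error.URLError as exc:
        if isinstance(exc.reason, ConnectionRefusedError):
            return "refused"  # nothing listening: instance or web server is down
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"  # connected or tried to, but no answer in time
        return "unreachable"  # DNS failure, routing problem, etc.
    except (socket.timeout, TimeoutError):
        return "timeout"  # the read stalled after connecting
    state = parse_system_state(body)
    return "no_metric" if state is None else "state=%g" % state


if __name__ == "__main__":
    # Placeholder URL: replace host and port with your own instance.
    print(probe("http://iris.example.com:52773/api/monitor/metrics"))
```

Running this from the monitoring host, outside of Prometheus, would at least separate "connection refused" from "request timed out" from "metric returned", which a gap in a Grafana panel cannot do on its own.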
