Question
· Feb 1, 2021

IKO (InterSystems Kubernetes Operator) Service not redirecting dynamically to the correct Pod

In the context of IKO (InterSystems Kubernetes Operator), the issue of the Service not redirecting dynamically to the correct Pod is still pending.
In production this can be dangerous, since an overload (or any other, simpler problem) can cause the primary Pod to change and leave the application inoperable until we intervene.

InterSystems support warned that this is still an open issue with IKO, but there are some possibilities I am studying.

To explore an idea I had, I would like the help of this Forum to answer the following question:

Is there a way, via a non-interactive command line, to detect whether the current IRIS instance is the primary member of a mirror or the backup?

I imagine it would be enough if I could detect whether a particular database is write-protected or not ...

I have access to the Linux terminal of the host machine, with "iris", "irissession", and other utilities present or installable.
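
For example, something along these lines is what I have in mind, piping ObjectScript into a non-interactive "iris session" (just a sketch; "IRIS" is the instance name on my host, and I have not validated the call or its output handling yet):

    # Non-interactive check: pipe ObjectScript into an iris session.
    # "IRIS" is the instance name (placeholder); %SYS is the system namespace.
    # $SYSTEM.Mirror.IsPrimary() should print 1 on the primary member and 0 otherwise.
    echo 'write $SYSTEM.Mirror.IsPrimary(),! halt' | iris session IRIS -U %SYS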

Product version: IRIS 2020.3
Discussion (3)

I had this idea of using the pod's readinessProbe to define which pod should be used by the service, like:

        readinessProbe:
          exec:
            command:
            - wget
            - -qO
            - /dev/null
            - http://127.0.0.1:52773/csp/bin/mirror_status.cxw
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 10

So in a mirror, the backup member will not be ready, while the primary member will be ready and receiving connections from the service.

The problem here is that I would have to manually change the service's selector and choose a better rolling-update strategy for the STS (since newly updated/recreated pods would never become "ready" unless the primary member went down first).
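
For illustration, manually re-pointing the Service at a single pod could look roughly like this (a sketch; "iris-svc" and "iris-data-0" are placeholder names, and statefulset.kubernetes.io/pod-name is the label Kubernetes adds automatically to StatefulSet pods):

    # Merge a pod-name selector into the Service so it targets only one pod
    # (existing selector keys are kept by the strategic merge patch).
    kubectl patch service iris-svc \
      -p '{"spec":{"selector":{"statefulset.kubernetes.io/pod-name":"iris-data-0"}}}'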

Well, that was just an idea...

Hi Jairton,

Usually, for applications with a high multi-user query volume, or with high-volume data ingestion concurrent with a high query volume, the use of compute nodes is recommended (see here for more details).

The application would connect to the compute nodes (via JDBC, for example), and the compute nodes connect to the data nodes through ECP (Enterprise Cache Protocol), which is "mirror-aware" and therefore knows which data node is active and which is passive.

You can connect each application to its own set of compute nodes, with a K8s service load-balancing that set of compute nodes and taking advantage of the cache shared across similar queries (see here for more details).

However, if you need to connect directly to the data nodes, I followed @Bob Kuszewski's approach of deploying another pod that periodically checks the mirror status of every IRIS mirror pair and then sets the intersystems.com/mirrorRole label on the data nodes to primary or backup accordingly. Apparently it worked smoothly.

You can find all information here: https://github.com/kuszewski/iris-k3s/tree/main/mirror-labeler
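
In essence, each polling cycle ends up doing the equivalent of the following (pod and service names below are placeholders), so a Service whose selector includes intersystems.com/mirrorRole: primary always routes to the current primary:

    # What the labeler effectively does after checking each mirror pair:
    kubectl label pod iris-data-0 intersystems.com/mirrorRole=primary --overwrite
    kubectl label pod iris-data-1 intersystems.com/mirrorRole=backup --overwrite

    # Check which pod the primary-facing Service currently resolves to:
    kubectl get endpoints iris-primary-svc -o wide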

Just a reminder that this is still unofficial code/a workaround.
If you want, I'm happy to share the YAML that I used, containing everything from setting up the IRIS cluster to deploying the mirror labeler, together with the RBAC and the K8s Service itself.

I'll set up a meeting with you to further discuss the options here.