Oliver Wilms · Aug 22 2m read

My opinion: IRIS Mirror not as reliable as expected in AWS Elastic Container Service

I described my efforts to optimize IRIS Mirror deployment in AWS Elastic Container Service (ECS) in my prior article:

IRIS Mirror in the cloud (AWS) | InterSystems Developer Community | AWS
 

I have come to the opinion that IRIS Mirror is not as reliable as needed when deployed in ECS. The root of the problem is that ECS randomly assigns one of the available IP addresses to each EC2 host or Fargate task it starts, so a replacement task does not necessarily come back on the address its predecessor had.

 

These addresses are stored in the MapMirrors section of the iris.cpf file, as shown here:

[MapMirrors.IRISMIRROR]

FAILOVER1=10.2ab.1cd.146,2188,,10.2ab.1cd.146,588E6700-DAB7-11EB-9111-0242AC110003,/failover1/iconfig/,0,10.2ab.1cd.146,51773,,0,,,0

FAILOVER2=10.2ab.1cd.168,2188,,10.2ab.1cd.168,B67866E8-DAB9-11EB-A42C-0A58A9FEAC02,/failover2/iconfig/,0,10.2ab.1cd.168,51773,,0,,,0
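
As a rough illustration of why a change of address is awkward here, the following Python sketch splits the two entries above on commas and reports which fields repeat the member's address. The function name `address_positions` is hypothetical, and treating the first field as the member's address is an assumption based only on the sample lines, not on a documented field layout.

```python
# Entries copied from the [MapMirrors.IRISMIRROR] section shown above.
SAMPLE = """\
FAILOVER1=10.2ab.1cd.146,2188,,10.2ab.1cd.146,588E6700-DAB7-11EB-9111-0242AC110003,/failover1/iconfig/,0,10.2ab.1cd.146,51773,,0,,,0
FAILOVER2=10.2ab.1cd.168,2188,,10.2ab.1cd.168,B67866E8-DAB9-11EB-A42C-0A58A9FEAC02,/failover2/iconfig/,0,10.2ab.1cd.168,51773,,0,,,0
"""


def address_positions(entry: str) -> tuple[str, str, list[int]]:
    """Return (member name, address, indexes of the fields equal to that address)."""
    name, _, value = entry.partition("=")
    fields = value.split(",")
    address = fields[0]  # assumption: the first field holds the member's address
    positions = [i for i, field in enumerate(fields) if field == address]
    return name.strip(), address, positions


for line in SAMPLE.splitlines():
    print(address_positions(line))
# prints ('FAILOVER1', '10.2ab.1cd.146', [0, 3, 7])
# prints ('FAILOVER2', '10.2ab.1cd.168', [0, 3, 7])
```

In these samples each member's address appears three times per entry, so any update has to change all three copies consistently.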

 

To let the IRIS Mirror Manager communicate between failover members, I first added code to the ZSTU startup routine that updates the IP addresses when IRIS starts. It reads the current IP addresses from files that the container entrypoint script updates.

 

This worked until this happened:

 

I had two tasks running, on ip.133 (failover2) and ip.168 (failover1).

 

I updated my code and proceeded to test it by stopping both tasks so ECS would start two new tasks.

 

As a result, the new task on ip.168, using the failover2 volume, became Primary, while the new task on ip.146, running on the failover1 volume, showed a Mirror Status of Stopped.

 

 

mbkmir1.log

 

08/17/21-13:24:01:380 (849) 0 [Utility.Event] Instance 'IRIS' starting on node ip-10-2ab-1cd-146.us-gov-west-1.compute.internal by user irisuser

 

08/17/21-13:24:02:236 (849) 2 [Utility.Event] System appears to have failed over from node ip-10-2ab-1cd-168.us-gov-west-1.compute.internal

 

08/17/21-13:24:03:781 (857) 2 [Utility.Event] Mirroring not started, this instance appears to have been copied. See ^MIRROR

 

mbkmir2.log

 

08/17/21-13:24:01:426 (855) 0 [Utility.Event] Instance 'IRIS' starting on node ip-10-2ab-1cd-168.us-gov-west-1.compute.internal by user irisuser

 

08/17/21-13:24:01:926 (855) 2 [Utility.Event] System appears to have failed over from node ip-10-2ab-1cd-133.us-gov-west-1.compute.internal
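
To make the pattern in these two logs easier to see, here is a small Python sketch that pulls the current node and the previous node out of such log text. The function name `node_change` and the two regular expressions are my own assumptions, written against the message wording in the excerpts above; the sample lines are copied from mbkmir1.log.

```python
import re

# Patterns match the message wording seen in the log excerpts above.
STARTING = re.compile(r"starting on node (\S+)")
FAILED_OVER = re.compile(r"appears to have failed over from node (\S+)")


def node_change(log_text: str):
    """Return (current node, previous node); an element is None if not found."""
    current = previous = None
    for line in log_text.splitlines():
        if m := STARTING.search(line):
            current = m.group(1)
        elif m := FAILED_OVER.search(line):
            previous = m.group(1)
    return current, previous


MBKMIR1 = """\
08/17/21-13:24:01:380 (849) 0 [Utility.Event] Instance 'IRIS' starting on node ip-10-2ab-1cd-146.us-gov-west-1.compute.internal by user irisuser
08/17/21-13:24:02:236 (849) 2 [Utility.Event] System appears to have failed over from node ip-10-2ab-1cd-168.us-gov-west-1.compute.internal
"""

current, previous = node_change(MBKMIR1)
print(f"now on {current}, previously on {previous}")
```

Read this way, the failover1 volume moved from ip.168 to ip.146, and the failover2 volume moved from ip.133 to ip.168, the address failover1 had just given up.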

 

 

My last attempt to solve this problem was to add code to the entrypoint script that updates the failover IP addresses in the iris.cpf file before IRIS starts. Even with this change I still see the “appears to have failed over from…” message:

 

 

08/21/21-03:53:09:791 (794) 2 [Utility.Event] System appears to have failed over from node ip-10-2ab-1cd-170.us-gov-west-1.compute.internal
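
For readers who want to picture the entrypoint rewrite described above, a minimal Python sketch follows. It is not my actual script: the function name `update_member_address`, its parameters, and the addresses in the example are hypothetical, and it assumes the first field of a member entry is the address that the other fields repeat. As noted above, a rewrite of this kind did not make the failover message go away.

```python
def update_member_address(cpf_text: str, member: str, new_address: str) -> str:
    """Replace every field that repeats the member's old address with new_address.

    Only the entry for `member` inside a [MapMirrors.*] section is changed.
    """
    out = []
    in_map_mirrors = False
    for line in cpf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Track whether we are inside a MapMirrors section.
            in_map_mirrors = stripped.startswith("[MapMirrors.")
        elif in_map_mirrors and stripped.startswith(member + "="):
            name, _, value = stripped.partition("=")
            fields = value.split(",")
            old_address = fields[0]  # assumption: first field is the address
            if old_address:
                fields = [new_address if f == old_address else f for f in fields]
            line = name + "=" + ",".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical addresses, used only to exercise the function.
EXAMPLE = (
    "[MapMirrors.IRISMIRROR]\n"
    "FAILOVER1=10.0.0.1,2188,,10.0.0.1,GUID-1,/failover1/iconfig/,0,10.0.0.1,51773,,0,,,0\n"
    "FAILOVER2=10.0.0.2,2188,,10.0.0.2,GUID-2,/failover2/iconfig/,0,10.0.0.2,51773,,0,,,0\n"
)
print(update_member_address(EXAMPLE, "FAILOVER1", "10.0.0.9"))
```

In an entrypoint, the equivalent step would read iris.cpf, apply the change for each failover member, and write the file back before IRIS is started.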

 

If I cannot be assured that I will have one Primary and one Backup, I do not consider IRIS Mirror reliable, so maybe it is just IIS Mirror?

 
