#High Availability


High availability (HA) refers to the goal of keeping a system or application operational and available to users a very high percentage of the time, minimizing both planned and unplanned downtime.
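To make "a very high percentage of the time" concrete, an availability target can be converted into a yearly downtime budget. The following is a minimal sketch of that arithmetic. The function name and the sample targets are illustrative assumptions, not figures from any of the articles listed here.

```python
# Minutes in a non-leap year (an assumption made to keep the arithmetic simple).
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime, in minutes, allowed by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Illustrative targets only; pick the ones that match your own requirements.
    for target in (99.0, 99.9, 99.99):
        minutes = downtime_minutes_per_year(target)
        print(f"{target}% availability allows about {minutes:.1f} minutes of downtime per year")
```

Each additional "nine" cuts the budget by a factor of ten, which is why planned downtime (upgrades, maintenance) has to be counted alongside unplanned outages.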


Article Myles Collins · Jul 22, 2025 6m read

Are you familiar with SQL databases, but not familiar with IRIS?  Then read on...

About a year ago I joined InterSystems, and that is how IRIS got on my radar. I've been using databases for over 40 years, much of that time for database vendors, and assumed IRIS would be largely the same as the other databases I knew. However, I was surprised to find that IRIS is in several ways quite unlike other databases, and often much better. With this, my first article in the Dev Community, I'll give a high-level overview of IRIS for people who are already familiar with other databases such as Oracle, SQL Server, Snowflake, PostgreSQL, etc. I hope I can make things clearer and simpler for you and save you some time getting started.

2
3 460
Article sween · Nov 12, 2025 9m read

Power your IrisCluster serviceTemplate with kube-vip

If you're running IRIS in a mirrored IrisCluster for HA in Kubernetes, the question of providing a Mirror VIP (Virtual IP) becomes relevant. Virtual IP offers a way for downstream systems to interact with IRIS using one IP address. Even when a failover happens, downstream systems can reconnect to the same IP address and continue working.

The lead-in above was stolen (gaffled, jacked, pilfered) from techniques shared with the community for VIPs across public clouds with IRIS by @Eduard Lebedyuk ...

Articles: ☁ vip-aws | vip-gcp | vip-azure

This version strives to solve the same challenges for IRIS on Kubernetes when being deployed via MAAS, on-prem, and possibly yet to be realized using cloud mechanics with Managed Kubernetes Services.

1
0 101
Article Ariel Glikman · Jan 13, 2025 3m read

You may have noticed that to configure a mirror for InterSystems IRIS for Health and HealthShare® Health Connect there is a special requirement. I wanted to go through it step by step in this article.

This assumes you have already configured the second failover member and confirmed a successful failover member status in the mirror monitor:

Step 1: Enable HS_Services user (on backup and primary)

 

Step 2: Switch to Namespace HSSYS and go to Interoperability > Configure > Credentials.

5
6 566
Article Ariel Glikman · Sep 16, 2025 14m read

One of the recommendations when deploying InterSystems Technologies for production is to set up High Availability. The recommended API Manager for these InterSystems Technologies is the InterSystems API Manager (IAM). IAM (essentially Kong Gateway) has multiple deployment topologies.

If you are looking for high availability you could use:

a) Kong Traditional Mode: Multiple Node Clusters

b) Hybrid Mode

c) DB-less Mode

Before we break them down let's first understand the out of the box deployment that is provided by InterSystems: Installing IAM Version 3.10.

2
3 228
Article Sam Ferguson · May 9, 2025 10m read

Regardless of whether an instance of IRIS is in the cloud or not, high availability and disaster recovery are always important considerations. While IKO already allows for the use of NodeSelectors to enforce the scheduling of IRISCluster nodes across multiple zones, multi-region k8s clusters are generally not recommended or even supported in the major CSPs' managed Kubernetes solutions. However, when discussing HA and DR for IRIS, we may want to have an async member in a completely separate region, or even in a different cloud provider altogether.

0
4 332
Article Roy Leonov · Mar 12, 2024 5m read

As an IT and cloud team manager with 18 years of experience with InterSystems technologies, I recently led our team in the transformation of our traditional on-premises ERP system to a cloud-based solution. We embarked on deploying InterSystems IRIS within a Kubernetes environment on AWS EKS, aiming to achieve a scalable, performant, and secure system. Central to this endeavor was the utilization of the AWS Application Load Balancer (ALB) as our ingress controller.

3
9 770
Article Ariel Glikman · Feb 2, 2025 3m read

Every pod is assigned a Quality of Service (QoS) class. There are three classes, and they determine the priority a pod has within a node.

The levels are as follows:

1) Guaranteed: High Priority

2) Burstable: Medium Priority

3) BestEffort: Low Priority

It is a way of telling the kubelet what your priorities are on a certain node if resources need to be reclaimed. This great GIF below by Anvesh Muppeda explains it.

If resources need to be freed, pods with BestEffort QoS are evicted first, then those with Burstable, and finally those with Guaranteed.
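The three classes and the eviction order described above can be sketched in a few lines. This is a simplified illustration, not the kubelet's implementation: the dictionary layout, function names, and pod names are assumptions, and quantities are compared as plain strings (so "500m" and "0.5" would not be treated as equal here).

```python
from typing import Dict, List, Tuple

# One container's resources: {"requests": {"cpu": ..., "memory": ...}, "limits": {...}}
Resources = Dict[str, Dict[str, str]]

# Classes listed from first evicted to last evicted, as described in the text.
EVICTION_ORDER = ["BestEffort", "Burstable", "Guaranteed"]


def qos_class(containers: List[Resources]) -> str:
    """Classify a pod from the requests and limits of its containers."""
    has_any_setting = False
    all_guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # When only a limit is given, the request defaults to the limit.
            request = requests.get(resource, limit)
            if limit is not None or request is not None:
                has_any_setting = True
            if limit is None or request != limit:
                all_guaranteed = False
    if not has_any_setting:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


def eviction_order(pods: List[Tuple[str, List[Resources]]]) -> List[str]:
    """Return pod names sorted from first evicted to last evicted."""
    ranked = sorted(pods, key=lambda pod: EVICTION_ORDER.index(qos_class(pod[1])))
    return [name for name, _ in ranked]


if __name__ == "__main__":
    # Hypothetical pods used only for illustration.
    pods = [
        ("iris-data", [{"requests": {"cpu": "2", "memory": "4Gi"},
                        "limits": {"cpu": "2", "memory": "4Gi"}}]),
        ("batch-job", [{}]),
        ("web-gateway", [{"requests": {"memory": "256Mi"}}]),
    ]
    for name, containers in pods:
        print(name, qos_class(containers))
    print(eviction_order(pods))
```

In this sketch a pod is Guaranteed only when every container sets CPU and memory limits with matching requests, BestEffort when nothing is set at all, and Burstable in every other case.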

0
2 389
Article Anton Umnikov · Jan 21, 2021 26m read

In this article, we’ll build a highly available IRIS configuration using Kubernetes Deployments with distributed persistent storage instead of the “traditional” IRIS mirror pair. This deployment would be able to tolerate infrastructure-related failures, such as node, storage and Availability Zone failures. The described approach greatly reduces the complexity of the deployment at the expense of slightly extended RTO.

16
8 4001
Article Mark Bolinsky · Feb 12, 2019 32m read

The Amazon Web Services (AWS) Cloud provides a broad set of infrastructure services, such as compute resources, storage options, and networking that are delivered as a utility: on-demand, available in seconds, with pay-as-you-go pricing. New services can be provisioned quickly, without upfront capital expense. This allows enterprises, start-ups, small and medium-sized businesses, and customers in the public sector to access the building blocks they need to respond quickly to changing business requirements.

 

Updated: 10-Jan, 2023 

3
12 7137
Article Luis Angel Pérez Ramos · Apr 25, 2023 12m read

A common need for our customers is to configure both HealthShare Health Connect and IRIS in high availability mode.

It's common for other integration engines on the market to be advertised as having "high availability" configurations, but that's not really true. In general, these solutions work with external databases and therefore, if these are not configured in high availability, the entire integration tool becomes unusable when a database crash occurs or the connection to the database is lost.

4
4 999
Article Bob Binstock · Sep 6, 2016 19m read

Mirroring 101

Caché mirroring is a reliable, inexpensive, and easy to implement high availability and disaster recovery solution for Caché and Ensemble-based applications. Mirroring provides automatic failover under a broad range of planned and unplanned outage scenarios, with application recovery time typically limited to seconds. Logical data replication eliminates storage as a single point of failure and a source of data corruption. Upgrades can be executed with little or no downtime.

22
3 7884
Article Alex Woodhead · Jan 28, 2023 3m read

Some use cases

1. A deployment may consist of two high availability instances and two disaster recovery instances in a different data center.

The corresponding UAT environment could replicate this, giving a total of 8 instances. How do you confirm CPF and scheduled task alignment across ALL instances?

2. Another team (possibly in another organization) makes changes to an IRIS instance to correct a problem, improve security, or modify shared system task configuration. Capture the CPF before and after to see what was done across instances.

2
0 543
Article Pete Greskoff · Jun 27, 2018 8m read

NB. Please be advised that PKI is not intended to produce certificates for secure production systems. You should make alternate arrangements to create certificates for your productions.
NB. PKI is deprecated as of IRIS 2024.1: documentation and announcement.

In this post, I am going to detail how to set up a mirror using SSL, including generating the certificates and keys via the Public Key Infrastructure built in to InterSystems IRIS Data Platform. I did a similar post in the past for Caché, so feel free to check that out here if you are not running InterSystems IRIS. Much like the original, the goal of this is to take you from new installations to a working mirror with SSL, including a primary, backup, and DR async member, along with a mirrored database. I will not go into security recommendations or restricting access to the files. This is meant to just simply get a mirror up and running. Example screenshots are taken on a 2018.1.1 version of IRIS, so yours may look slightly different.

3
4 1843
Article Jose-Tomas Salvador · Dec 30, 2021 1m read

For those who, at some point, need to test what ECP can do for horizontal scalability (computing power and/or concurrent users and processes) but are short on time or would rather not build the environment, configure the server nodes, and so on, I've just published the app/sample OPNEx-ECP Deployment on Open Exchange.

0
0 348
Article Oliver Wilms · Aug 22, 2021 2m read

I have described my efforts to optimize IRIS Mirror deployment in AWS Elastic Container Service (ECS) in my prior article.

IRIS Mirror in the cloud (AWS) | InterSystems Developer Community | AWS
 

I have come to the opinion that IRIS Mirror is not as reliable as needed when deployed in ECS. The root of the problem is the fact that ECS randomly assigns one of the available IP addresses to each EC2 host or Fargate task it starts.

 

These get stored in iris.cpf file in MapMirrors section as shown here:

[MapMirrors.IRISMIRROR]

FAILOVER1=10.2ab.1cd.146,2188,,10.2ab.1cd.

0
0 475
Article Oliver Wilms · Aug 4, 2021 3m read

I have been working on redesigning a Health Connect production which runs on a mirrored instance of HealthShare 2019. We were told to take advantage of containers. We got to work on IRIS 2020.1 and split the database part from the interoperability part. We had the IRIS mirror running on EC2 instances and used containers to run the IRIS interoperability application. Eventually we decided to run the data tier in containers as well. Later we switched from using EC2 instances to Fargate “server-less” compute.

14
2 1013
Article Anton Umnikov · May 31, 2021 6m read

All source code to the article is available at: https://github.com/antonum/ha-iris-k8s 

In the previous article, we discussed how to set up IRIS on a k8s cluster with high availability based on distributed storage instead of traditional mirroring. As an example, that article used an Azure AKS cluster. In this one, we'll continue to explore highly available configurations on k8s, this time based on Amazon EKS (the AWS managed Kubernetes service), and include an option for database backup and restore based on Kubernetes Snapshot.

Installation

Let's get right to business.

2
0 1581
Article Mark Bolinsky · Oct 12, 2018 31m read

Google Cloud Platform (GCP) provides a feature rich environment for Infrastructure-as-a-Service (IaaS) as a cloud offering fully capable of supporting all of InterSystems products including the latest InterSystems IRIS Data Platform. Care must be taken, as with any platform or deployment model, to ensure all aspects of an environment are considered such as performance, availability, operations, and management procedures.  Specifics of each of those areas will be covered in this article.

0
3 4735
Article Mark Bolinsky · Jul 10, 2018 4m read

The InterSystems technology architect team is often asked about recommended storage arrays or storage technologies. To make this information available to a wider audience as a reference, we are starting a new series presenting some of the results we have seen with various storage technologies. As a general recommendation, all-flash storage is highly recommended with all InterSystems products to provide the lowest latency and predictable IOPS capabilities.

The first in the series covers the most recently tested storage array, the NetApp AFF A300. This is a middle-tier storage array with several higher models above it. This specific A300 model can support anything from a minimal configuration of only a few drives to hundreds of drives per HA pair, and can also be clustered with multiple controller pairs for tens of PBs of disk capacity and hundreds of thousands of IOPS or higher.

0
0 3542
Article Mark Bolinsky · Mar 18, 2016 9m read

++ Update: August 1, 2018

The use of the InterSystems Virtual IP (VIP) address built in to Caché database mirroring has certain limitations. In particular, it can only be used when mirror members reside on the same network subnet. When multiple data centers are used, network subnets are not often “stretched” beyond the physical data center due to added network complexity (more detailed discussion here). For similar reasons, Virtual IP is often not usable when the database is hosted in the cloud.

Network traffic management appliances such as load balancers (physical or virtual) can be used to achieve the same level of transparency, presenting a single address to the client applications or devices. The network traffic manager automatically redirects clients to the current mirror primary’s real IP address. The automation is intended to meet the needs of both HA failover and DR promotion following a disaster. 
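The redirection logic amounts to polling each mirror member and routing clients to whichever one currently reports itself as primary. The sketch below illustrates only that selection step. The function name, the member addresses, and the `is_primary` probe are hypothetical; in a real deployment the probe would be whatever health check your load balancer and mirror members actually support.

```python
from typing import Callable, Iterable, Optional


def find_primary(members: Iterable[str],
                 is_primary: Callable[[str], bool]) -> Optional[str]:
    """Return the first member whose probe reports it as the mirror primary.

    `is_primary` is a caller-supplied, hypothetical health probe. A probe that
    raises OSError (for example, a refused connection) is treated as "not primary".
    """
    for address in members:
        try:
            if is_primary(address):
                return address
        except OSError:
            continue  # Unreachable member: keep checking the others.
    return None  # No primary found, for example in the middle of a failover.


if __name__ == "__main__":
    # Simulated mirror state for illustration; addresses are made up.
    roles = {"10.0.1.10": False, "10.0.2.10": True}
    target = find_primary(roles, lambda address: roles[address])
    print("Route clients to:", target)
```

Because clients only ever see the traffic manager's address, the same selection works for an automatic HA failover and for a manual DR promotion: whichever member starts answering the probe as primary receives the traffic.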

12
6 6986
Article Pete Greskoff · Jan 10, 2017 9m read

NB. Please be advised that PKI is not intended to produce certificates for secure production systems. You should make alternate arrangements to create certificates for your productions.
NB. PKI is deprecated as of IRIS 2024.1: documentation and announcement.

In this post, I am going to detail how to set up a mirror using SSL, including generating the certificates and keys via the Public Key Infrastructure built in to Caché. The goal of this is to take you from new installations to a working mirror with SSL, including a primary, backup, and DR async member, along with a mirrored database.

7
0 2783
Article Bob Binstock · Sep 7, 2016 6m read

Mirror Outage Procedures

Caché mirroring is a reliable, inexpensive and easy to implement high availability and disaster recovery solution for Caché and Ensemble-based applications. This article provides an overview of recommended procedures for dealing with a variety of planned and unplanned mirror outage scenarios. (For detailed information about mirroring and a wide range of mirror-related procedures, see Mirroring 101.)

A Caché mirror typically consists of two Caché instances on physically independent hosts, called failover members.

0
1 1235
Article Mark Bolinsky · Jul 1, 2016 17m read

++Update: August 2, 2018

This article provides a sample reference architecture for building robust, well-performing, and highly available applications based on InterSystems technologies. It is applicable to Caché, Ensemble, HealthShare, TrakCare, and associated embedded technologies such as DeepSee, iKnow, Zen and Zen Mojo.

Azure has two different deployment models for creating and working with resources: Azure Classic and Azure Resource Manager. The information detailed in this article is based on the Azure Resource Manager model (ARM).

4
0 12656
Article Pete Greskoff · Apr 7, 2016 1m read

Presenters: Pete Greskoff, Sebastian Musielak
Task: Ensure high availability of your HealthShare deployments
Approach: Discuss high-availability options and focus on HealthShare’s new support for database mirroring
 

With the new release of HealthShare, mirroring is now supported for high availability. This session will describe high availability options and focus on mirroring your HealthShare deployments.

 

Content related to this session, including slides, video and additional learning content can be found here.

0
0 403
Article Ray Fucillo · Apr 7, 2016 1m read

Presenter: Ray Fucillo
Task: Provide high availability (HA) and disaster recovery (DR) in diverse architectures that demand high performance, including replication over long distances
Approach: Give examples of mirror architectures in disparate environments, including geographically separated systems. Discuss performance considerations and advances in InterSystems’ mirroring technology
 

In this session you will learn about deploying Mirroring to provide HA and DR in diverse architectures that demand high performance and throughput. Challenges and solutions to achieving high throughput will be covered along with mirror architectures that involve long distances and disparate environments.

 

Content related to this session, including slides, video and additional learning content can be found here.

0
0 330
Article Mark Bolinsky · Apr 7, 2016 1m read

Presenter: Mark Bolinsky
Task: Decide whether a converged infrastructure is ideal for your enterprise applications
Approach: Discuss best practices and provide guidance on the right questions to ask
 

The traditional use of “SAN storage” is no longer the only choice for deploying enterprise applications. Software-defined data centers are making inroads into enterprise data centers, and there is good reason for it. There is the potential for significant infrastructure cost savings, architecture simplification, reduced administration costs, and, depending on the configuration, even better performance. This session will discuss some best practices and outline decision guidance to help you ask the right questions when considering hyper-converged architectures.

 

Content related to this session, including slides, video and additional learning content can be found here.

0
0 357
Article Murray Oldfield · Apr 7, 2016 1m read

Presenter: Murray Oldfield
Task: Deploy applications based on InterSystems’ technology using VMware.
Approach: Provide a checklist of factors to consider, particularly when deploying a production database application that requires high availability
 

Are you ready to deploy your applications on a virtualized architecture? This talk will highlight what you need to plan and do when deploying applications built on ISC data platforms using VMware, with a special focus on what you need to know when planning for highly available (HA) production database applications.

 

Content related to this session, including slides, video and additional learning content can be found here.

0
0 404
Article Developer Community Admin · Oct 21, 2015 1m read

Introduction

This document is intended to provide a survey of various High Availability (HA) strategies that can be used in conjunction with InterSystems Caché, Ensemble, and HealthShare Foundation. This document also provides an overview of the various types of system outages that can occur, as well as how each strategy would handle a given outage, with the goal of helping you choose the right strategy for your specific deployment.

0
0 359