Article
Mark Bolinsky · Oct 12, 2018

InterSystems IRIS Example Reference Architectures for Google Cloud Platform (GCP)

Google Cloud Platform (GCP) provides a feature rich environment for Infrastructure-as-a-Service (IaaS) as a cloud offering fully capable of supporting all of InterSystems products including the latest InterSystems IRIS Data Platform. Care must be taken, as with any platform or deployment model, to ensure all aspects of an environment are considered such as performance, availability, operations, and management procedures. Specifics of each of those areas will be covered in this article. The following overview and details are provided by Google and can be found here. Overview GCP Resources GCP consists of a set of physical assets, such as computers and hard disk drives, and virtual resources, such as virtual machines (VMs), that are contained in Google's data centers around the globe. Each data center location is in a global region. Each region is a collection of zones, which are isolated from each other within the region. Each zone is identified by a name that combines a letter identifier with the name of the region. This distribution of resources provides several benefits, including redundancy in case of failure and reduced latency by locating resources closer to clients. This distribution also introduces some rules about how resources can be used together. Accessing GCP Resources In cloud computing physical hardware and software become services. These services provide access to the underlying resources. When you develop your InterSytems IRIS-based application on GCP, you mix and match these services into combinations that provide the infrastructure you need, and then add your code to enable the scenarios you want to build. Details of the available services can be found here. Projects Any GCP resources that you allocate and use must belong to a project. A project is made up of the settings, permissions, and other metadata that describe your applications. Resources within a single project can work together easily, for example by communicating through an internal network, subject to the regions-and-zones rules. The resources that each project contains remain separate across project boundaries; you can only interconnect them through an external network connection. Interacting with Services GCP gives you three basic ways to interact with the services and resources. Console The Google Cloud Platform Console provides a web-based, graphical user interface that you can use to manage your GCP projects and resources. When you use the GCP Console, you create a new project, or choose an existing project, and use the resources that you create in the context of that project. You can create multiple projects, so you can use projects to separate your work in whatever way makes sense for you. For example, you might start a new project if you want to make sure only certain team members can access the resources in that project, while all team members can continue to access resources in another project. Command-line Interface If you prefer to work in a terminal window, the Google Cloud SDK provides the gcloud command-line tool, which gives you access to the commands you need. The gcloud tool can be used to manage both your development workflow and your GCP resources. gcloud reference details can be found here. GCP also provides Cloud Shell, a browser-based, interactive shell environment for GCP. You can access Cloud Shell from the GCP console. Cloud Shell provides: A temporary Compute Engine virtual machine instance. Command-line access to the instance from a web browser. A built-in code editor. 
5 GB of persistent disk storage. Pre-installed Google Cloud SDK and other tools. Language support for Java, Go, Python, Node.js, PHP, Ruby and .NET. Web preview functionality. Built-in authorization for access to GCP Console projects and resources. Client Libraries The Cloud SDK includes client libraries that enable you to easily create and manage resources. GCP client libraries expose APIs for two main purposes: App APIs provide access to services. App APIs are optimized for supported languages, such as Node.js and Python. The libraries are designed around service metaphors, so you can work with the services more naturally and write less boilerplate code. The libraries also provide helpers for authentication and authorization. Details can be found here. Admin APIs offer functionality for resource management. For example, you can use admin APIs if you want to build your own automated tools. You also can use the Google API client libraries to access APIs for products such as Google Maps, Google Drive, and YouTube. Details of GCP client libraries can be found here. InterSystems IRIS Sample Architectures As part of this article, sample InterSystems IRIS deployments for GCP are provided as a starting point for your application specific deployment. These can be used as a guideline for numerous deployment possibilities. This reference architecture demonstrates highly robust deployment options starting with the smallest deployments to massively scalable workloads for both compute and data requirements. High availability and disaster recovery options are covered in this document along with other recommended system operations. It is expected these will be modified by the individual to support their organization’s standard practices and security policies. InterSystems is available for further discussions or questions of GCP-based InterSystems IRIS deployments for your specific application. Sample Reference Architectures The following sample architectures will provide several different configurations with increasing capacity and capabilities. Consider these examples of small development / production / large production / production with sharded cluster that show the progression from starting with a small modest configuration for development efforts and then growing to massively scalable solutions with proper high availability across zones and multi-region disaster recovery. In addition, an example architecture of using the new sharding capabilities of InterSystems IRIS Data Platform for hybrid workloads with massively parallel SQL query processing. Small Development Configuration In this example, a minimal configuration is used to illustrates a small development environment capable of supporting up to 10 developers and 100GB of data. More developers and data can easily be supported by simply changing the virtual machine instance type and increasing storage of the persistent disks as appropriate. This is adequate to support development efforts and become familiar with InterSystems IRIS functionality along with Docker container building and orchestration if desired. High availability with database mirroring is typically not used with a small configuration, however it can be added at any time if high availability is needed. Small Configuration Sample Diagram The below sample diagram in Figure 2.1.1-a illustrates the table of resources in Figure 2.1.1-b. The gateways included are just examples, and can be adjusted accordingly to suit your organization’s standard network practices. 
The following resources within the GCP VPC Project are provisioned as a minimum small configuration. GCP resources can be added or removed as required. Small Configuration GCP Resources A sample of Small Configuration GCP resources is provided in the following table. Proper network security and firewall rules need to be considered to prevent unwanted access into the VPC. Google provides network security best practices for getting started, which can be found here. Note: VM instances require a public IP address to reach GCP services. While this practice might raise some concerns, Google recommends limiting the incoming traffic to these VM instances by using firewall rules. If your security policy requires truly internal VM instances, you will need to set up a NAT proxy manually on your network and a corresponding route so that the internal instances can reach the Internet. It is important to note that you cannot connect to a fully internal VM instance directly by using SSH. To connect to such internal machines, you must set up a bastion instance that has an external IP address and then tunnel through it. A bastion host can be provisioned to provide the external-facing point of entry into your VPC. Details of bastion hosts can be found here. Production Configuration This example presents a more sizable production configuration that incorporates InterSystems IRIS database mirroring to support high availability and disaster recovery. Included in this configuration is a synchronous mirror pair of InterSystems IRIS database servers split between two zones within region-1 for automatic failover, and a third DR asynchronous mirror member in region-2 for disaster recovery in the unlikely event an entire GCP region is offline. The InterSystems Arbiter and ICM server are deployed in a separate third zone for added resiliency. The sample architecture also includes a set of optional load-balanced web servers to support a web-enabled application. These web servers with the InterSystems Gateway can be scaled independently as needed. Production Configuration Sample Diagram The sample diagram in Figure 2.2.1-a illustrates the table of resources found in Figure 2.2.1-b. The gateways included are just examples, and can be adjusted accordingly to suit your organization’s standard network practices. The following resources within the GCP VPC Project are recommended as a minimum to support this production configuration. GCP resources can be added or removed as required. Production Configuration GCP Resources A sample of Production Configuration GCP resources is provided in the following tables. 
The gateways included are just examples, and can be adjusted accordingly to suit your organization’s standard network practices. Included in this configuration is a failover mirror pair, four or more ECP clients (application servers), and one or more web servers per application server. The failover database mirror pairs are split between two different GCP zones in the same region for fault domain protection with the InterSystems Arbiter and ICM server deployed in a separate third zone for added resiliency. Disaster recovery extends to a second GCP region and zone(s) similar to the earlier example. Multiple DR regions can be used with multiple DR Async mirror member targets if desired. The following resources within the GPC VPC Project are recommended as a minimum recommendation to support a large production deployment. GCP resources can be added or removed as required. Large Production Configuration GCP Resources Sample of Large Production Configuration GCP resources is provided below in the following tables. Production Configuration with InterSystems IRIS Sharded Cluster In this example, a horizontally scaled configuration for hybrid workloads with SQL is provided by including the new sharded cluster capabilities of InterSystems IRIS to provide massive horizontal scaling of SQL queries and tables across multiple systems. Details of InterSystems IRIS sharded cluster and its capabilities are discussed later in this article. Production Configuration with InterSystems IRIS Sharded Cluster The sample diagram in Figure 2.4.1-a illustrates the table of resources in Figure 2.4.1-b. The gateways included are just examples, and can be adjusted accordingly to suit your organization’s standard network practices. Included in this configuration are four mirror pairs as the data nodes. Each of the failover database mirror pairs are split between two different GCP zones in the same region for fault domain protection with the InterSystems Arbiter and ICM server deployed in a separate third zone for added resiliency. This configuration allows for all the database access methods to be available from any data node in the cluster. The large SQL table(s) data is physically partitioned across all data nodes to allow for massive parallelization of both query processing and data volume. Combining all these capabilities provides the ability to support complex hybrid workloads such as large-scale analytical SQL querying with concurrent ingestion of new data, all within a single InterSystems IRIS Data Platform. Note that in the above diagram and the “resource type” column in the table below, the term “Compute [Engine]” is a GCP term representing a GCP (virtual) server instance as described further in section 3.1 of this document. It does not represent or imply the use of “compute nodes” in the cluster architecture described later in this article. The following resources within the GPC VPC Project are recommended as a minimum recommendation to support a sharded cluster. GCP resources can be added or removed as required. Production with Sharded Cluster Configuration GCP Resources Sample of Sharded Cluster Configuration GCP resources is provided below in the following table. Introduction of Cloud Concepts Google Cloud Platform (GCP) provides a feature rich cloud environment for Infrastructure-as-a-Service (IaaS) fully capable of supporting all of InterSystems products including support for container-based DevOps with the new InterSystems IRIS Data Platform. 
Care must be taken, as with any platform or deployment model, to ensure all aspects of an environment are considered such as performance, availability, system operations, high availability, disaster recovery, security controls, and other management procedures. This document will cover the three major components of all cloud deployments: Compute, Storage, and Networking. Compute Engines (Virtual Machines) Within GCP there are several options available for compute engine resources with numerous virtual CPU and memory specifications and associated storage options. One item to note within GCP, references to the number of vCPUs in a given machine type equates to one vCPU is one hyper-thread on the physical host at the hypervisor layer. For the purposes of this document n1-standard* and n1-highmem* instance types will be used and are most widely available in most GCP deployment regions. However, the use of n1-ultramem* instance types are great options for very large working datasets keeping massive amounts of data cached in memory. Default instance settings such as Instance Availability Policy or other advanced features are used except where noted. Details of the various machine types can be found here. Disk Storage The storage type most directly related to InterSystems products are the persistent disk types, however local storage may be used for high levels of performance as long as data availability restrictions are understood and accommodated. There are several other options such as Cloud Storage (buckets), however those are more specific to an individual application’s requirements rather than supporting the operation of InterSystems IRIS Data Platform. Like most other cloud providers, GCP imposes limitations on the amount of persistent storage that can be associated to an individual compute engine. These limits include the maximum size of each disk, the number of persistent disks attached to each compute engine, and the amount of IOPS per persistent disk with an overall individual compute engine instance IOPS cap. In addition, there are imposed IOPS limits per GB of disk space, so at times provisioning more disk capacity is required to achieve desired IOPS rate. These limits may change over time and to be confirmed with Google as appropriate. There are two types of persistent storage types for disk volumes: Standard Persistent and SSD Persistent disks. SSD Persistent disks are more suited for production workloads that require predictable low-latency IOPS and higher throughput. Standard Persistent disks are more an economical option for non-production development and test or archive type workloads. Details of the various disk types and limitations can be found here. VPC Networking The virtual private cloud (VPC) network is highly recommended to support the various components of InterSystems IRIS Data Platform along with providing proper network security controls, various gateways, routing, internal IP address assignments, network interface isolation, and access controls. An example VPC will be detailed in the examples provided within this document. Details of VPC networking and firewalls can be found here. Virtual Private Cloud (VPC) Overview GCP VPC’s are slightly different than other cloud providers allowing for simplicity and greater flexibility. A comparison of concepts can be found here. Within a GCP project, several VPCs per project are allowed (currently a max of 5 per project), and there are two options for creating a VPC network – auto mode and custom mode. 
Details of each type are provided here. In most large cloud deployments, multiple VPCs are provisioned to isolate the various gateways types from application-centric VPCs and leverage VPC peering for inbound and outbound communications. It is highly recommended to consult with your network administrator for details on allowable subnets and any organizational firewall rules of your company. VPC peering is not covered in this document. In the examples provided in this document, a single VPC with three subnets will be used to provide network isolation of the various components for predictable latency and bandwidth and security isolation of the various InterSystems IRIS components. Network Gateway and Subnet Definitions Two gateways are provided in the example in this document to support both Internet and secure VPN connectivity. Each ingress access is required to have appropriate firewall and routing rules to provide adequate security for the application. Details on how to use routes can be found here. Three subnets are used in the provided example architectures dedicated for use with InterSystems IRIS Data Platform. The use of these separate network subnets and network interfaces allows for flexibility in security controls and bandwidth protection and monitoring for each of the three above major components. Details on the various use cases can be found here. Details for creating virtual machine instances with multiple network interfaces can be found here. The subnets included in these examples: User Space Network for Inbound connected users and queries Shard Network for Inter-shard communications between the shard nodes Mirroring Network for high availability using synchronous replication and automatic failover of individual data nodes. Note: Failover synchronous database mirroring is only recommended between multiple zones which have low latency interconnects within a single GCP region. Latency between regions is typically too high for to provide a positive user experience especially for deployment with a high rate of updates. Internal Load Balancers Most IaaS cloud providers lack the ability to provide for a Virtual IP (VIP) address that is typically used in automatic database failover designs. To address this, several of the most commonly used connectivity methods, specifically ECP clients and Web Gateways, are enhanced within InterSystems IRIS to no longer rely on VIP capabilities making them mirror-aware and automatic. Connectivity methods such as xDBC, direct TCP/IP sockets, or other direct connect protocols, require the use of a VIP-like address. To support those inbound protocols, InterSystems database mirroring technology makes it possible to provide automatic failover for those connectivity methods within GCP using a health check status page called mirror_status.cxw to interact with the load balancer to achieve VIP-like functionality of the load balancer only directing traffic to the active primary mirror member, thus providing a complete and robust high availability design within GCP. Details of using a load balancer to provide VIP-like functionality is provided here. 
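To make the VIP-like failover mechanism described above more concrete, the following gcloud sketch creates an HTTP health check that polls mirror_status.cxw and attaches it to an internal backend service. This is an illustration only: the resource names, region, port 52773, and the /csp/bin request path are assumptions that depend on your Web Gateway configuration, and the backends and forwarding rule still need to be added afterwards.

```bash
# HTTP health check that reports healthy only on the acting primary mirror member.
# The port and request path are assumptions; match them to your Web Gateway setup.
gcloud compute health-checks create http iris-mirror-primary \
  --port 52773 \
  --request-path "/csp/bin/mirror_status.cxw" \
  --check-interval 5s --timeout 5s \
  --healthy-threshold 2 --unhealthy-threshold 2

# Internal backend service using that health check, so traffic is only sent to
# whichever mirror member currently reports itself as primary.
gcloud compute backend-services create iris-mirror-backend \
  --load-balancing-scheme internal \
  --protocol tcp \
  --health-checks iris-mirror-primary \
  --region us-east1
# Next steps (not shown): add the mirror member instance groups as backends
# and create an internal forwarding rule for the client-facing address.
```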
Sample VPC Topology Combining all the components together, the following illustration in Figure 4.3-a demonstrates the layout of a VPC with the following characteristics:
Leverages multiple zones within a region for high availability
Provides two regions for disaster recovery
Utilizes multiple subnets for network segregation
Includes separate gateways for both Internet and VPN connectivity
Uses a cloud load balancer for IP failover of mirror members
Persistent Storage Overview As discussed in the introduction, the use of GCP persistent disks is recommended, specifically SSD persistent disk types. SSD persistent disks are recommended due to the higher read and write IOPS rates and low latency required for transactional and analytical database workloads. Local SSDs may be used in certain circumstances, however be aware that the performance gains of local SSDs come with certain trade-offs in availability, durability, and flexibility. Details of Local SSD data persistence can be found here to understand when Local SSD data is preserved and when it is not. LVM Striping Like other cloud providers, GCP imposes numerous limits on storage, in IOPS, space capacity, and number of devices per virtual machine instance. Consult GCP documentation for current limits, which can be found here. With these limits, LVM striping becomes necessary to maximize IOPS beyond that of a single disk device for a database instance. In the example virtual machine instances provided, the following disk layouts are recommended. Performance limits associated with SSD persistent disks can be found here. Note: There is currently a maximum of 16 persistent disks per virtual machine instance, although GCP currently lists an increase to 128 as being in beta, so this will be a welcome enhancement. LVM striping allows random IO workloads to be spread across more disk devices and their inherent disk queues. Below is an example of how to use LVM striping with Linux for the database volume group. This example uses four disks in an LVM stripe with a physical extent (PE) size of 4MB. Alternatively, larger PE sizes can be used if needed.
Step 1: Create Standard or SSD Persistent Disks as needed.
Step 2: Confirm the IO scheduler is NOOP for each of the disk devices using “lsblk -do NAME,SCHED”.
Step 3: Identify the disk devices using “lsblk -do KNAME,TYPE,SIZE,MODEL”.
Step 4: Create the Volume Group with the new disk devices: vgcreate -s 4M <vg name> <list of all disks just created> example: vgcreate -s 4M vg_iris_db /dev/sd[h-k]
Step 5: Create the Logical Volume: lvcreate -n <lv name> -L <size of LV> -i <number of disks in volume group> -I 4M <vg name> example: lvcreate -n lv_irisdb01 -L 1000G -i 4 -I 4M vg_iris_db
Step 6: Create the File System: mkfs.xfs -K <logical volume device> example: mkfs.xfs -K /dev/vg_iris_db/lv_irisdb01
Step 7: Mount the File System: edit /etc/fstab with the following mount entry /dev/mapper/vg_iris_db-lv_irisdb01 /vol-iris/db xfs defaults 0 0 and then mount /vol-iris/db
Using the above table, each of the InterSystems IRIS servers will have the following configuration: two disks for SYS, four disks for DB, two disks for primary journals, and two disks for alternate journals. For growth, LVM allows devices and logical volumes to be expanded when needed without interruption. Consult the Linux documentation on best practices for ongoing management and expansion of LVM volumes. Note: Enabling asynchronous IO for both the database and the write image journal files is highly recommended. 
See the following community article for details on enabling on Linux: https://community.intersystems.com/post/lvm-pe-striping-maximize-hyper-converged-storage-throughput Provisioning New with InterSystems IRIS is InterSystems Cloud Manager (ICM). ICM carries out many tasks and offers many options for provisioning InterSystems IRIS Data Platform. ICM is provided as a Docker image that includes everything for provisioning a robust GCP cloud-based solution. ICM currently support provisioning on the following platforms: Google Cloud Platform (GCP) Amazon Web Services including GovCloud (AWS / GovCloud) Microsoft Azure Resource Manager including Government (ARM / MAG) VMware vSphere (ESXi) ICM and Docker can run from either a desktop/laptop workstation or have a centralized dedicated modest “provisioning” server and centralized repository. The role of ICM in the application lifecycle is Define -> Provision -> Deploy -> Manage Details for installing and using ICM with Docker can be found here. NOTE: The use of ICM is not required for any cloud deployment. The traditional method of installation and deployment with tar-ball distributions is fully supported and available. However, ICM is recommended for ease of provisioning and management in cloud deployments. Container Monitoring ICM includes a basic monitoring facility using Weave Scope for container-based deployment. It is not deployed by default, and needs to be specified in the defaults file using the Monitor field. Details for monitoring, orchestration, and scheduling with ICM can be found here. An overview of Weave Scope and documentation can be found here. High Availability InterSystems database mirroring provides the highest level of availability in any cloud environment. There are options to provide some virtual machine resiliency directly at the instance level. Details of the various policies available in GCP can be found here. Earlier sections discussed how a cloud load balancer will provide automatic IP address failover for a Virtual IP (VIP-like) capability with database mirroring. The cloud load balancer uses the mirror_status.cxw health check status page mentioned earlier in the Internal Load Balancers section. There are two modes of database mirroring - synchronous with automatic failover and asynchronous mirroring. In this example, synchronous failover mirroring will be covered. The details of mirroring can he found here. The most basic mirroring configuration is a pair of failover mirror members in an arbiter-controlled configuration. The arbiter is placed in a third zone within the same region to protect from potential zone outages impacting both the arbiter and one of the mirror members. There are many ways mirroring can be setup specifically in the network configuration. In this example, we will use the network subnets defined previously in the Network Gateway and Subnet Definitions section of this document. Example IP address schemes will be provided in a following section and for the purpose of this section, only the network interfaces and designated subnets will be depicted. Disaster Recovery InterSystems database mirroring extends the capability of high available to also support disaster recovery to another GCP geographic region to support operational resiliency in the unlikely event of an entire GCP region going offline. How an application is to endure such outages depends on the recovery time objective (RTO) and recovery point objectives (RPO). 
These will provide the initial framework for the analysis required to design a proper disaster recovery plan. The following links provides a guide for the items to be considered when developing a disaster recovery plan for your application. https://cloud.google.com/solutions/designing-a-disaster-recovery-plan and https://cloud.google.com/solutions/disaster-recovery-cookbook Asynchronous Database Mirroring InterSystems IRIS Data Platform’s database mirroring provides robust capabilities for asynchronously replicating data between GCP zones and regions to help support the RTO and RPO goals of your disaster recovery plan. Details of async mirror members can be found here. Similar to the earlier high availability section, a cloud load balancer will provide automatic IP address failover for a Virtual IP (VIP-like) capability for DR asynchronous mirroring as well using the same mirror_status.cxw health check status page mentioned earlier in the Internal Load Balancers section. In this example, DR asynchronous failover mirroring will be covered along with the introduction of the GCP Global Load Balancing service to provide upstream systems and client workstations with a single anycast IP address regardless of which zone or region your InterSystems IRIS deployment is operating. One of the advances of GCP is the load balancer is a software defined global resource and not bound to a given region. This allows for the unique capability to leverage a single service across regions since it is not an instance or device-based solution. Details of GCP Global Load Balancing with Single Anycast IP can be found here. In the above example, the IP addresses of all three InterSystems IRIS instances are provided to the GCP Global Load Balancer, and it will only direct traffic to whichever mirror member is the acting primary mirror regardless of the zone or region it is located. Sharded Cluster InterSystems IRIS includes a comprehensive set of capabilities to scale your applications, which can be applied alone or in combination, depending on the nature of your workload and the specific performance challenges it faces. One of these, sharding, partitions both data and its associated cache across a number of servers, providing flexible, inexpensive performance scaling for queries and data ingestion while maximizing infrastructure value through highly efficient resource utilization. An InterSystems IRIS sharded cluster can provide significant performance benefits for a wide variety of applications, but especially for those with workloads that include one or more of the following: High-volume or high-speed data ingestion, or a combination. Relatively large data sets, queries that return large amounts of data, or both. Complex queries that do large amounts of data processing, such as those that scan a lot of data on disk or involve significant compute work. Each of these factors on its own influences the potential gain from sharding, but the benefit may be enhanced where they combine. For example, a combination of all three factors — large amounts of data ingested quickly, large data sets, and complex queries that retrieve and process a lot of data — makes many of today’s analytic workloads very good candidates for sharding. Note that these characteristics all have to do with data; the primary function of InterSystems IRIS sharding is to scale for data volume. 
However, a sharded cluster can also include features that scale for user volume, when workloads involving some or all of these data-related factors also experience a very high query volume from large numbers of users. Sharding can be combined with vertical scaling as well. Operational Overview The heart of the sharded architecture is the partitioning of data and its associated cache across a number of systems. A sharded cluster physically partitions large database tables horizontally — that is, by row — across multiple InterSystems IRIS instances, called data nodes, while allowing applications to transparently access these tables through any node and still see the whole dataset as one logical union. This architecture provides three advantages: Parallel processing: Queries are run in parallel on the data nodes, with the results merged, combined, and returned to the application as full query results by the node the application connected to, significantly enhancing execution speed in many cases. Partitioned caching: Each data node has its own cache, dedicated to the sharded table data partition it stores, rather than a single instance’s cache serving the entire data set, which greatly reduces the risk of overflowing the cache and forcing performance-degrading disk reads. Parallel loading: Data can be loaded onto the data nodes in parallel, reducing cache and disk contention between the ingestion workload and the query workload and improving the performance of both. Details of InterSystems IRIS sharded cluster can be found here. Elements of Sharding and Instance Types A sharded cluster consists of at least one data node and, if needed for specific performance or workload requirements, an optional number of compute nodes. These two node types offer simple building blocks presenting a simple, transparent, and efficient scaling model. Data Nodes Data nodes store data. At the physical level, sharded table[1] data is spread across all data nodes in the cluster and non-sharded table data is physically stored on the first data node only. This distinction is transparent to the user with the possible sole exception that the first node might have a slightly higher storage consumption than the others, but this difference is expected to become negligible as sharded table data would typically outweigh non-sharded table data by at least an order of magnitude. Sharded table data can be rebalanced across the cluster when needed, typically after adding new data nodes. This will move “buckets” of data between nodes to approximate an even distribution of data. At the logical level, non-sharded table data and the union of all sharded table data is visible from any node, so clients will see the whole dataset, regardless of which node they’re connecting to. Metadata and code are also shared across all data nodes. The basic architecture diagram for a sharded cluster simply consists of data nodes that appear uniform across the cluster. Client applications can connect to any node and will experience the data as if it were local. [1] For convenience, the term “sharded table data” is used throughout the document to represent “extent” data for any data model supporting sharding that is marked as sharded. The terms “non-sharded table data” and “non-sharded data” are used to represent data that is in a shardable extent not marked as such or for a data model that simply doesn’t support sharding yet. 
Data Nodes For advanced scenarios where low latencies are required, potentially at odds with a constant influx of data, compute nodes can be added to provide a transparent caching layer for servicing queries. Compute nodes cache data. Each compute node is associated with a data node for which it caches the corresponding sharded table data and, in addition to that, it also caches non-sharded table data as needed to satisfy queries. Because compute nodes don’t physically store any data and are meant to support query execution, their hardware profile can be tailored to suit those needs, for example by emphasizing memory and CPU and keeping storage to the bare minimum. Ingestion is forwarded to the data nodes, either directly by the driver (xDBC, Spark) or implicitly by the sharding manager code when “bare” application code runs on a compute node. Sharded Cluster Illustrations There are various combinations of deploying a sharded cluster. The following high-level diagrams are provided to illustrate the most common deployment models. These diagrams do not include the networking gateways and details and provide to focus only on the sharded cluster components. Basic Sharded Cluster The following diagram is the simplest sharded cluster with four data nodes deployed in a single region and in a single zone. A GCP Cloud Load Balancer is used to distribute client connections to any of the sharded cluster nodes. In this basic model, there is no resiliency or high availability provided beyond that of what GCP provides for a single virtual machine and its attached SSD persistent storage. Two separate network interface adapters are recommended to provide both network security isolation for the inbound client connections and also bandwidth isolation between the client traffic and the sharded cluster communications. Basic Sharded Cluster with High Availability The following diagram is the simplest sharded cluster with four mirrored data nodes deployed in a single region and splitting each node’s mirror between zones. A GCP Cloud Load Balancer is used to distribute client connections to any of the sharded cluster nodes. High availability is provided through the use of InterSystems database mirroring which will maintain a synchronously replicated mirror in a secondary zone within the region. Three separate network interface adapters are recommended to provide both network security isolation for the inbound client connections and bandwidth isolation between the client traffic, the sharded cluster communications, and the synchronous mirror traffic between the node pairs. This deployment model also introduces the mirror arbiter as described in an earlier section of this document. Sharded Cluster with Separate Compute Nodes The following diagram expands the sharded cluster for massive user/query concurrency with separate compute nodes and four data nodes. The Cloud Load Balancer server pool only contains the addresses of the compute nodes. Updates and data ingestion will continue to update directly to the data nodes as before to sustain ultra-low latency performance and avoid interference and congestion of resources between query/analytical workloads from real-time data ingestion. With this model the allocation of resources can be fine-tuned for scaling of compute/query and ingestion independently allowing for optimal resources where needed in a “just-in-time” and maintaining an economical yet simple solution instead of wasting resources unnecessarily just to scale compute or data. 
Compute Nodes lend themselves for a very straightforward use of GCP auto scale grouping (aka Autoscaling) to allow for automatic addition or deletion of instances from a managed instance group based on increased or decreased load. Autoscaling works by adding more instances to your instance group when there is more load (upscaling), and deleting instances when the need for instances is lowered (downscaling). Details of GCP Autoscaling can be found here. Autoscaling helps cloud-based applications gracefully handle increases in traffic and reduces cost when the need for resources is lower. Simply define the autoscaling policy and the autoscaler performs automatic scaling based on the measured load. Backup Operations There are multiple options available for backup operations. The following three options are viable for your GCP deployment with InterSystems IRIS. The first two options, detailed below, incorporate a snapshot type procedure which involves suspending database writes to disk prior to creating the snapshot and then resuming updates once the snapshot was successful. The following high-level steps are taken to create a clean backup using either of the snapshot methods: Pause writes to the database via database External Freeze API call. Create snapshots of the OS + data disks. Resume database writes via External Thaw API call. Backup facility archives to backup location Details of the External Freeze/Thaw APIs can be found here. Note: Sample scripts for backups are not included in this document, however periodically check for examples posted to the InterSystems Developer Community. www.community.intersystems.com The third option is InterSystems Online backup. This is an entry-level approach for smaller deployments with a very simple use case and interface. However, as databases increase in size, external backups with snapshot technology are recommended as a best practice with advantages including the backup of external files, faster restore times, and an enterprise-wide view of data and management tools. Additional steps such as integrity checks can be added on a periodic interval to ensure clean and consistent backup. The decision points on which option to use depends on the operational requirements and policies of your organization. InterSystems is available to discuss the various options in more detail. GCP Persistent Disk Snapshot Backup Backup operations can be achieved using GCP gcloud command-line API along with InterSystems ExternalFreeze/Thaw API capabilities. This allows for true 24x7 operational resiliency and assurance of clean regular backups. Details for managing and creating and automation GCP Persistent Disk Snapshots can be found here. Logical Volume Manager (LVM) Snapshots Alternatively, many of the third-party backup tools available on the market can be used by deploying individual backup agents within the VM itself and leveraging file-level backups in conjunction with Logical Volume Manager (LVM) snapshots. One of the major benefits to this model is having the ability to have file-level restores of either Windows or Linux based VMs. A couple of points to note with this solution, is since GCP and most other IaaS cloud providers do not provide tape media, all backup repositories are disk-based for short term archiving and have the ability to leverage blob or bucket type low cost storage for long-term retention (LTR). 
If using this method, it is highly recommended to use a backup product that supports de-duplication technologies to make the most efficient use of disk-based backup repositories. Some examples of these backup products with cloud support include, but are not limited to: Commvault, EMC Networker, HPE Data Protector, and Veritas NetBackup. Note: InterSystems does not validate or endorse one backup product over another. The responsibility for choosing backup management software is up to the individual customer. Online Backup For small deployments, the built-in Online Backup facility is also a viable option. This InterSystems database online backup utility backs up data in database files by capturing all blocks in the databases and then writes the output to a sequential file. This proprietary backup mechanism is designed to cause no downtime to users of the production system. Details of Online Backup can be found here. In GCP, after the online backup has finished, the backup output file and all other files in use by the system must be copied to some other storage location outside of that virtual machine instance. Bucket/object storage is a good destination for this. There are two options for using a GCP Storage bucket: Use the gcloud scripting APIs directly to copy and manipulate the newly created online backup (and other non-database) files. Details can be found here. Mount a storage bucket as a file system and use it much like a persistent disk, even though Cloud Storage buckets are object storage. Details of mounting a Cloud Storage bucket using Cloud Storage FUSE can be found here.
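To make the External Freeze / snapshot / External Thaw sequence described above concrete, here is a minimal shell sketch. It is an illustration only: the instance name IRIS, the disk names, and the zone are hypothetical placeholders, and a real script should also check the status returned by the freeze and thaw calls and bound how long the instance stays frozen.

```bash
#!/bin/bash
# Minimal sketch of a snapshot-based backup using ExternalFreeze/ExternalThaw.
# Instance name, disk names, and zone below are placeholders; adjust to your deployment.

# 1. Pause database writes
iris session IRIS -U %SYS "##Class(Backup.General).ExternalFreeze()"

# 2. Snapshot the OS and data persistent disks while writes are frozen
for DISK in iris-vm-boot iris-db-disk-1 iris-db-disk-2 iris-db-disk-3 iris-db-disk-4; do
  gcloud compute disks snapshot "${DISK}" \
    --zone=us-east1-b \
    --snapshot-names="${DISK}-$(date +%Y%m%d-%H%M%S)"
done

# 3. Resume database writes
iris session IRIS -U %SYS "##Class(Backup.General).ExternalThaw()"
```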
Announcement
Anastasia Dyubaylo · Jul 13, 2023

[Webinar in Hebrew] Introducing VS Code, and Moving from InterSystems Studio

Hi Community, We're pleased to invite you to the upcoming webinar in Hebrew: 👉 Introducing VS Code, and Moving from Studio in Hebrew 👈 🗓️ Date & time: July 25th, 3:00 PM IDT 🗣️ Speaker: @Tani.Frankel, Sales Engineer Manager In this session, we will review using VS Code for InterSystems-based development. It is aimed at beginners of VS Code, but will also cover some areas that might be beneficial for users who are already using VS Code. We will also cover some topics relevant to people moving from InterSystems Studio to VS Code. The session is relevant for users of Caché / Ensemble / InterSystems IRIS Data Platform / InterSystems IRIS for Health / HealthShare Health Connect. ➡️ Register today and enjoy! >>
Announcement
Fabiano Sanches · Jun 21, 2023

Developer preview #4 for InterSystems IRIS, & IRIS for Health 2023.2

InterSystems announces its fourth preview, as part of the developer preview program for the 2023.2 release. This release will include InterSystems IRIS and InterSystems IRIS for Health. Highlights Many updates and enhancements have been added in 2023.2 and there are also brand-new capabilities, such as Time-Aware Modeling, enhancements of Foreign Tables, and the ability to use Read-Only Federated Tables. Note that some of these features or improvements may not be available in this current developer preview. Another important topic is the removal of the Private Web Server (PWS) from the installers. This removal has been announced since last year; the PWS will be removed from InterSystems installers, but it is still included in this preview. See this note in the documentation. --> If you are interested in trying the installers without the PWS, please enroll in its EAP using this form, selecting the option "NoPWS". Additional information related to this EAP can be found here. Future preview releases are expected to be updated biweekly and we will add features as they are ready. Please share your feedback through the Developer Community so we can build a better product together. Initial documentation can be found at the links below. They will be updated over the next few weeks until launch is officially announced (General Availability - GA): InterSystems IRIS InterSystems IRIS for Health Availability and Package Information As usual, Continuous Delivery (CD) releases come with classic installation packages for all supported platforms, as well as container images in Docker container format. For a complete list, refer to the Supported Platforms document. Installation packages and preview keys are available from the WRC's preview download site or through the evaluation services website (use the flag "Show Preview Software" to get access to 2023.2). Container images for both Enterprise and Community Editions of InterSystems IRIS and IRIS for Health and all corresponding components are available from the new InterSystems Container Registry web interface. For additional information about docker commands, please see this post: Announcing the InterSystems Container Registry web user interface. The build number for this developer preview is 2023.2.0.204.0. For a full list of the available images, please refer to the ICR documentation. Alternatively, tarball versions of all container images are available via the WRC's preview download site.
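As a quick reference, pulling this preview build from the InterSystems Container Registry generally looks like the sketch below; the repository path and Community Edition image name are assumptions based on the usual ICR layout, so confirm the exact image names and tags in the ICR web interface before use.

```bash
# Authenticate against the InterSystems Container Registry (token obtained from the ICR web UI)
docker login containers.intersystems.com

# Pull the preview build (image path and tag shown are assumed; verify in the ICR listing)
docker pull containers.intersystems.com/intersystems/iris-community:2023.2.0.204.0
```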
Announcement
Fabiano Sanches · Jul 6, 2023

Developer preview #5 for InterSystems IRIS, & IRIS for Health 2023.2

InterSystems announces its fifth preview, as part of the developer preview program for the 2023.2 release. This release will include InterSystems IRIS and InterSystems IRIS for Health. Highlights Many updates and enhancements have been added in 2023.2 and there are also brand-new capabilities, such as Time-Aware Modeling, and enhancements of Foreign Tables (but still as an experimental feature). Note that some of these features or improvements may not be available in this current developer preview. Another important topic is the removal of the Private Web Server (PWS) from the installers. This removal has been announced since last year; the PWS will be removed from InterSystems installers, but it is still included in this preview. See this note in the documentation. --> If you are interested in trying the installers without the PWS, please enroll in its EAP using this form, selecting the option "NoPWS". Additional information related to this EAP can be found here. Future preview releases are expected to be updated biweekly and we will add features as they are ready. Please share your feedback through the Developer Community so we can build a better product together. Initial documentation can be found at the links below. They will be updated over the next few weeks until launch is officially announced (General Availability - GA): InterSystems IRIS InterSystems IRIS for Health Availability and Package Information As usual, Continuous Delivery (CD) releases come with classic installation packages for all supported platforms, as well as container images in Docker container format. For a complete list, refer to the Supported Platforms document. Installation packages and preview keys are available from the WRC's preview download site or through the evaluation services website (use the flag "Show Preview Software" to get access to 2023.2). Container images for both Enterprise and Community Editions of InterSystems IRIS and IRIS for Health and all corresponding components are available from the new InterSystems Container Registry web interface. For additional information about docker commands, please see this post: Announcing the InterSystems Container Registry web user interface. The build number for this developer preview is 2023.2.0.210.0. For a full list of the available images, please refer to the ICR documentation. Alternatively, tarball versions of all container images are available via the WRC's preview download site.
Article
Roy Leonov · Mar 12, 2024

Orchestrating Secure Management Access in InterSystems IRIS with AWS EKS and ALB

As an IT and cloud team manager with 18 years of experience with InterSystems technologies, I recently led our team in the transformation of our traditional on-premises ERP system to a cloud-based solution. We embarked on deploying InterSystems IRIS within a Kubernetes environment on AWS EKS, aiming to achieve a scalable, performant, and secure system. Central to this endeavor was the utilization of the AWS Application Load Balancer (ALB) as our ingress controller. However, our challenge extended beyond the initial cluster and application deployment; we needed to establish an efficient and secure method to manage the various IRIS instances, particularly when employing mirroring for high availability. This post will focus on the centralized management solution we implemented to address this challenge. By leveraging the capabilities of AWS EKS and ALB, we developed a robust architecture that allowed us to effectively manage and monitor the IRIS cluster, ensuring seamless accessibility and maintaining the highest levels of security. In the following sections, we will delve into the technical details of our implementation, sharing the strategies and best practices we employed to overcome the complexities of managing a distributed IRIS environment on AWS EKS. Through this post, we aim to provide valuable insights and guidance to assist others facing similar challenges in their cloud migration journeys with InterSystems technologies. Configuration Summary Our configuration capitalized on the scalability of AWS EKS, the automation of the InterSystems Kubernetes Operator (IKO) 3.6, and the routing proficiency of AWS ALB. This combination provided a robust and agile environment for our ERP system's web services. Mirroring Configuration and Management Access We deployed mirrored IRIS data servers to ensure high availability. These servers, alongside a single application server, were each equipped with a Web Gateway sidecar pod. Establishing secure access to these management portals was paramount and was achieved through meticulous network and service configuration. Detailed Configuration Steps Initial Deployment with IKO: Leveraging IKO 3.6, we deployed the IRIS instances, ensuring they adhered to our high-availability requirements. Web Gateway Management Configuration: We created server access profiles within the Web Gateway Management interface. These profiles, named data00 and data01, were crucial in establishing direct and secure connectivity to the respective Web Gateway sidecar pods associated with each IRIS data server. To achieve precise routing of incoming traffic to the appropriate Web Gateway, we utilized the DNS pod names of the IRIS data servers. By configuring the server access profiles with the fully qualified DNS pod names, such as iris-svc.app.data-0.svc.cluster.local and iris-svc.app.data-1.svc.cluster.local, we ensured that requests were accurately directed to the designated Web Gateway sidecar pods. 
https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=GCGI_config_serv IRIS Terminal Commands: To align the CSP settings with the newly created server profiles, we executed the following commands in the IRIS terminal: d $System.CSP.SetConfig("CSPConfigName","data00") # on data00 d $System.CSP.SetConfig("CSPConfigName","data01") # on data01 https://docs.intersystems.com/healthconnectlatest/csp/docbook/DocBook.UI.Page.cls?KEY=GCGI_remote_csp NGINX Configuration: The NGINX configuration was updated to respond to /data00 and /data01 paths, followed by creating Kubernetes services and ingress resources that interfaced with the AWS ALB, completing our secure and unified access solution. Creating Kubernetes Services: I initiated the setup by creating Kubernetes services for the IRIS data servers and the SAM: Ingress Resource Definition: Next, I defined the ingress resources, which route traffic to the appropriate paths using annotations to secure and manage access. Explanations for the Annotations in the Ingress YAML Configuration: alb.ingress.kubernetes.io/scheme: internal Specifies that the Application Load Balancer should be internal, not accessible from the internet. This ensures that the ALB is only reachable within the private network and not exposed publicly. alb.ingress.kubernetes.io/subnets: subnet-internal, subnet-internal Specifies the subnets where the Application Load Balancer should be provisioned. In this case, the ALB will be deployed in the specified internal subnets, ensuring it is not accessible from the public internet. alb.ingress.kubernetes.io/target-type: ip Specifies that the target type for the Application Load Balancer should be IP-based. This means that the ALB will route traffic directly to the IP addresses of the pods, rather than using instance IDs or other target types. alb.ingress.kubernetes.io/target-group-attributes: stickiness.enabled=true Enables sticky sessions (session affinity) for the target group. When enabled, the ALB will ensure that requests from the same client are consistently routed to the same target pod, maintaining session persistence. alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]' Specifies the ports and protocols that the Application Load Balancer should listen on. In this case, the ALB is configured to listen for HTTPS traffic on port 443. alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:il- Specifies the Amazon Resource Name (ARN) of the SSL/TLS certificate to use for HTTPS traffic. The ARN points to a certificate stored in AWS Certificate Manager (ACM), which will be used to terminate SSL/TLS connections at the ALB. These annotations provide fine-grained control over the behavior and configuration of the AWS Application Load Balancer when used as an ingress controller in a Kubernetes cluster. They allow you to customize the ALB's networking, security, and routing settings to suit your specific requirements. After configuring the NGINX with location settings to respond to the paths for our data servers, the final step was to extend this setup to include the SAM by defining its service and adding the route in the ingress file. Security Considerations: We meticulously aligned our approach with cloud security best practices, particularly the principle of least privilege, ensuring that only necessary access rights are granted to perform a task. 
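To make the Service and Ingress definitions referenced above concrete, here is a minimal sketch of what they can look like. All names (iris-svc-data00, the namespace, selector label, ports, subnets, and the certificate ARN) are hypothetical placeholders, and the annotation values simply restate the settings explained above; adapt everything to your cluster and AWS account.

```yaml
# Hypothetical Service exposing the Web Gateway sidecar of data node 0
apiVersion: v1
kind: Service
metadata:
  name: iris-svc-data00
  namespace: iris
spec:
  type: ClusterIP
  selector:
    statefulset.kubernetes.io/pod-name: iris-svc-app-data-0   # placeholder pod selector
  ports:
    - name: http
      port: 80
      targetPort: 80            # Web Gateway (nginx) sidecar port (assumed)
---
# Hypothetical Ingress routing /data00 and /data01 through an internal AWS ALB
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: iris-mgmt-ingress
  namespace: iris
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internal
    alb.ingress.kubernetes.io/subnets: subnet-aaaa, subnet-bbbb      # internal subnets (placeholders)
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/target-group-attributes: stickiness.enabled=true
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:REGION:ACCOUNT:certificate/PLACEHOLDER
spec:
  rules:
    - http:
        paths:
          - path: /data00
            pathType: Prefix
            backend:
              service:
                name: iris-svc-data00
                port:
                  number: 80
          - path: /data01
            pathType: Prefix
            backend:
              service:
                name: iris-svc-data01
                port:
                  number: 80
```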
(Screenshots: management portal access for DATA00, DATA01, and SAM.) Conclusion: This article shared our journey of migrating our application to the cloud using InterSystems IRIS on AWS EKS, focusing on creating a centralized, accessible, and secure management solution for the IRIS cluster. By leveraging security best practices and innovative approaches, we achieved a scalable and highly available architecture. We hope that the insights and techniques shared in this article prove valuable to those embarking on their own cloud migration projects with InterSystems IRIS. If you apply these concepts to your work, we'd be interested to learn about your experiences and any lessons you discover throughout the process. I found this extremely useful, thank you. Amazing architecture and sophisticated system. The use of InterSystems IRIS infrastructure is a result of hard work and complicated integration with the system. Thanks for sharing.
Announcement
Anastasia Dyubaylo · Mar 28

[Video] Rapidly Create and Deploy Secure REST Services on InterSystems IRIS

Hi Community, Enjoy the new video on InterSystems Developers YouTube: ⏯ Rapidly Create and Deploy Secure REST Services on InterSystems IRIS @ Global Summit 2024 Learn about isc-rest, an open source package for defining REST APIs for CRUD operations, class queries, and business logic. We'll cover the basic concepts and patterns and talk about the success TriFour has had using it in their products. Presenters:🗣 @Timothy.Leavitt, Development Manager, InterSystems🗣 @Gerrit.Henning1669, CEO, TriFour🗣 @Stephan.duPlooy7271, CTO, TriFour Enjoy watching, and look forward to more videos!👍
Article
Andreas Schneider · Apr 22

Testing Metadata Inconsistencies in InterSystems IRIS Using the DATATYPE_SAMPLE Database

When using standard SQL or the object layer in InterSystems IRIS, metadata consistency is usually maintained through built-in validation and type enforcement. However, legacy systems that bypass these layers—directly accessing globals—can introduce subtle and serious inconsistencies. Understanding how drivers behave in these edge cases is crucial for diagnosing legacy data issues and ensuring application reliability.The DATATYPE_SAMPLE database is designed to help analyze error scenarios where column values do not conform to the data types or constraints defined in the metadata. The goal is to evaluate how InterSystems IRIS and its drivers (JDBC, ODBC, .NET) and different tools behave when such inconsistencies occur. In this post, I’ll focus on the JDBC driver. What's the Problem? Some legacy applications write directly to globals. If a relational model (created via CREATE TABLE or manually defined using a global mapping) is used to expose this data, the mapping defines the underlying values conform to the declared metadata for each column. When this assumption is broken, different types of problems may occur: Access Failure: A value cannot be read at all, and an exception is thrown when the driver tries to access it. Silent Corruption: The value is read successfully but does not match the expected metadata. Undetected Mutation: The value is read and appears valid, but was silently altered by the driver to fit the metadata, making the inconsistency hard to detect. Simulating the Behavior To demonstrate these scenarios, I created the DATATYPE_SAMPLE database, available on the InterSystems Open Exchange:🔗 Package page🔗 GitHub repo The table used for the demonstration: CREATE TABLE SQLUser.Employee ( ID BIGINT NOT NULL AUTO_INCREMENT, Age INTEGER, Company BIGINT, DOB DATE, FavoriteColors VARCHAR(4096), Name VARCHAR(50) NOT NULL, Notes LONGVARCHAR, Picture LONGVARBINARY, SSN VARCHAR(50) NOT NULL, Salary INTEGER, Spouse BIGINT, Title VARCHAR(50), Home_City VARCHAR(80), Home_State VARCHAR(2), Home_Street VARCHAR(80), Home_Zip VARCHAR(5), Office_City VARCHAR(80), Office_State VARCHAR(2), Office_Street VARCHAR(80), Office_Zip VARCHAR(5) ); Example 1: Access Failure To simulate an inconsistency, I injected invalid values into the DOB (Date of Birth\Datatype DATE) column using direct global access. Specifically, the rows with primary keys 101, 180, 181, 182, 183, 184, and 185 were populated with values that do not represent valid dates.The values looks like this now: As you can see, a string was appended to the end of a $H (Horolog) value. According to the table's metadata, this column is expected to contain a date—but the stored value clearly isn't one. So what happens when you try to read this data? Well, it depends on the tool you're using. I tested a few different tools to compare how they handle this kind of inconsistency. 1) SquirrelSQL (SQuirreL SQL Client Home Page)When SquirrelSQL attempts to access the data, an error occurs. It tries to read all rows and columns, and any cell that contains invalid data is simply marked as "ERROR". Unfortunately, I couldn't find any additional details or error messages explaining the cause. 2) SQLWorkbench/J (SQL Workbench/J - Home)SQL Workbench/J stops processing the result set as soon as it encounters the first invalid cell. It displays an error message like "Invalid date", but unfortunately, it doesn't provide any information about which row caused the issue. 3) DBVisualizer (dbvis) & DBeaver (dbeaver) DBVisualizer and DBeaver behave similarly. 
Both tools continue reading the result set and provide detailed error messages for each affected cell. This makes it easy to identify the corresponding row that caused the issue. 4) SQL DATA LENS (SQL Data Lens - a powerful tool for InterSystems IRIS and Caché) With the latest release of SQL DATA LENS, you get detailed information about the error, the affected row, and the actual database value. As shown in the screenshot, the internal value of the DOB column for the first row is "39146<Ruined>", which cannot be cast to a valid DATE. SQL DATA LENS also allows you to configure whether result processing should stop at the first erroneous cell or continue reading to retrieve all available data. The next part of this article will show details about: Silent Corruption: The value is read successfully but does not match the expected metadata. Undetected Mutation: The value is read and appears valid, but was silently altered by the driver to fit the metadata, making the inconsistency hard to detect. Andreas
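If you want to pinpoint the offending rows without depending on any particular GUI tool, a small per-row probe also works. The sketch below uses the community sqlalchemy-iris dialect rather than the JDBC driver discussed above, so it only illustrates the general access-failure pattern; the connection string and credentials are placeholders.

from sqlalchemy import create_engine, text

# Placeholder connection details; assumes the sqlalchemy-iris dialect is installed.
engine = create_engine("iris://_SYSTEM:SYS@localhost:1972/USER")

with engine.connect() as conn:
    ids = [row[0] for row in conn.execute(text("SELECT ID FROM SQLUser.Employee"))]
    for emp_id in ids:
        try:
            # Fetch DOB one row at a time so a conversion failure identifies the exact record.
            dob = conn.execute(
                text("SELECT DOB FROM SQLUser.Employee WHERE ID = :id"), {"id": emp_id}
            ).scalar()
        except Exception as exc:
            print(f"ID {emp_id}: DOB could not be read ({exc})")
        else:
            print(f"ID {emp_id}: DOB = {dob!r}")

Rows 101 and 180 through 185 from the example above should surface in the except branch if this driver also refuses to convert them; drivers may behave differently, which is exactly the point of the article.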
Article
Eduard Lebedyuk · May 14, 2018

Continuous Delivery of your InterSystems solution using GitLab - Index

In this series of articles, I'd like to present and discuss several possible approaches toward software development with InterSystems technologies and GitLab. I will cover such topics as: First article Git basics, why a high-level understanding of Git concepts is important for modern software development, How Git can be used to develop software (Git flows) Second article GitLab Workflow - a complete software life cycle process - from idea to user feedback Continuous Delivery - a software engineering approach in which teams produce software in short cycles, ensuring that the software can be reliably released at any time. It aims at building, testing, and releasing software faster and more frequently. Third article GitLab installation and configuration Connecting your environments to GitLab Fourth article Continuous delivery configuration Fifth article Containers and how (and why) they can be used. Sixth article Main components for a continuous delivery pipeline with containers How they all work together. Seventh article Continuous delivery configuration with containers Eighth article Continuous delivery configuration with InterSystems Cloud Manager Ninth article Container architecture Tenth article CI/CD for Configuration and Data Eleventh article Interoperability and CI/CD Twelfth article Dynamic Inactivity Timeouts In this series of articles, I covered general approaches to Continuous Delivery. It is an extremely broad topic, and this series of articles should be seen more as a collection of recipes rather than something definitive. If you want to automate the building, testing, and delivery of your application, Continuous Delivery in general, and GitLab in particular, is the way to go. Continuous Delivery and containers allow you to customize your workflow as you need it.
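As a taste of what the series builds toward, below is a minimal .gitlab-ci.yml sketch of the build/test/deploy shape these articles discuss. The stage layout is standard GitLab CI; the image name, test entry point, and deploy script are placeholders rather than the exact pipeline from the series.

stages:
  - build
  - test
  - deploy

build:
  stage: build
  script:
    - docker build -t my-app:$CI_COMMIT_SHORT_SHA .                       # placeholder image name

test:
  stage: test
  script:
    - docker run --rm my-app:$CI_COMMIT_SHORT_SHA /opt/app/run_tests.sh   # placeholder test entry point

deploy:
  stage: deploy
  environment: production
  when: manual
  script:
    - ./deploy.sh my-app:$CI_COMMIT_SHORT_SHA                             # placeholder deployment script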
Article
sween · Nov 7, 2019

Export InterSystems IRIS Data to BigQuery on Google Cloud Platform

Loading your IRIS Data to your Google Cloud Big Query Data Warehouse and keeping it current can be a hassle with bulky Commercial Third Party Off The Shelf ETL platforms, but it is made dead simple using the iris2bq utility. Let's say IRIS is contributing to the workload for a Hospital system, routing DICOM images, ingesting HL7 messages, posting FHIR resources, or pushing CCDAs to the next provider in a transition of care. Natively, IRIS persists these objects in various stages of the pipeline via the nature of the business processes and anything you included along the way. Let's send that up to Google Big Query to augment and complement the rest of our Data Warehouse data and ETL (Extract Transform Load) or ELT (Extract Load Transform) to our heart's desire. A reference architecture diagram may be worth a thousand words, but 3 bullet points may work out a little bit better: It exports the data from IRIS into DataFrames It saves them into GCS as .avro to keep the schema along with the data: this avoids having to specify/create the BigQuery table schema beforehand. It starts BigQuery jobs to import those .avro into the respective BigQuery tables you specify. Under the hood, iris2bq uses the Spark framework for the sake of simplicity, but no Hadoop cluster is needed. It is configured as a "local" cluster by default, meaning the application runs standalone. The tool is meant to be launched on an interval either through cron or something like Airflow. All you have to do is point it at your IRIS instance and tell it what tables you want to sync to Big Query; they then magically sync to an existing dataset or to a new one that you specify. How To Setup And if a reference architecture and 3 bullet points didn't do a good job explaining it, maybe actually running it will: Google Cloud Setup You can do this any way you want, here are a few options for you, but all you have to do in GCP is: Create a Project Enable the APIs of Big Query and Cloud Storage Create a service account with access to create resources and download the json file. Using the Google Cloud Console (Easiest) https://cloud.google.com Using gcloud (Impress Your Friends): gcloud projects create iris2bq-demo --enable-cloud-apis With Terraform (Coolest): Create a main.tf file after modifying the values: // Create the GCP Project resource "google_project" "gcp_project" { name = "IRIS 2 Big Query Demo" project_id = "iris2bq-demo" // You'll need this org_id = "1234567" } // Enable the APIS resource "google_project_services" "gcp_project_apis" { project = "iris2bq-demo" services = ["bigquery.googleapis.com", "storage.googleapis.com"] } Then do a: terraform init terraform plan terraform apply IRIS Setup Let's quickly jam some data into IRIS for a demonstration. Create a class like so: Class User.People Extends (%Persistent, %Populate) { Property ID As %String; Property FirstName As %String(POPSPEC = "NAME"); Property LastName As %String(POPSPEC = "NAME"); } Then run the populate to generate some data. USER>do ##class(User.People).Populate(100000) Alternatively, you can grab an irissession, ensure you are in the USER namespace and run the following commands. USER> SET result=$SYSTEM.SQL.Execute("CREATE TABLE People(ID int, FirstName varchar(255), LastName varchar(255))") USER> for i=1:1:100000 { SET result=$SYSTEM.SQL.Execute("INSERT INTO People VALUES ("_i_", 'First"_i_"', 'Last"_i_"')") } Both routes will create a table called "People" and insert 100,000 rows.
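A quick sanity check of the load, run from the Management Portal SQL page or any SQL client connected to the USER namespace (a trivial example):

SELECT COUNT(*) AS TotalRows FROM People

SELECT TOP 5 ID, FirstName, LastName FROM People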
Either way you go, if everything worked out, you should be able to query for some dummy rows in IRIS. These are the rows we are sending to Big Query. IRIS2BQ Setup Download the latest release of the utility iris2bq, and unzip it. Then cd to the `bin` directory, move your credentials file to the root of this directory, and create an application.conf file as below in the same root. Taking a look at the configuration file below, you can get an idea of how the utility works. Specify a jdbc url and the credentials for the system user. Give it a list of tables that you want to appear in Big Query. Tell the utility which project to point to and the location of your credentials file. Then tell it a target Big Query Dataset, and a target bucket to write the .avro files to. Quick note on the GCP block: the dataset and bucket can either exist or not exist, as the utility will create those resources for you. jdbc { url = "jdbc:IRIS://127.0.0.1:51773/USER" user = "_SYSTEM" password = "flounder" // the password is flounder tables = [ "people" ] //IRIS tables to send to big query } gcloud { project = "iris2bq-demo" service-account-key-path = "service.key.json" //gcp service account bq.dataset = "iris2bqdemods" // target bq dataset gcs.tmp-bucket = "iris2bqdemobucket" //target storage bucket } Run At this point we should be parked at our command prompt in the root of the utility, with the conf file we created and the json credentials file. Now that we have all that in place, let's run it and check the result. $ export GOOGLE_CLOUD_PROJECT=iris2bq-demo $ export GOOGLE_APPLICATION_CREDENTIALS=service.key.json $ ./iris2bq -Dconfig.file=application.conf The output is a tad chatty, but if the import was successful it will state `people import done!` Let's head over to Big Query and inspect our work... The baseNUBE team hopes you found this helpful! Now set up a job to run it on an interval and JOIN all over your IRIS data in Big Query! The article is considered as InterSystems Data Platform Best Practice. Hi Ron, thanks for this great article. There's a typo which creates a wondering question about the potentiality of Google Cloud : Using the Google Cloud Console (Easiest) https://could.google.com fixed, thank you!
Article
sween · Jun 7, 2023

DeDupe an InterSystems® FHIR® Server With FHIR SQL Builder and Zingg

This post backs the demonstration at Global Summit 2023 "Demos and Drinks" with details most likely lost in the noise of the event. This is a demonstration on how to use the FHIR SQL Capabilities of InterSystems FHIR Server along side the Super Awesome Identity and Resolution Solution, Zingg.ai to detect duplicate records in your FHIR repository, and the basic idea behind remediation of those resources with the under construction PID^TOO|| currently enrolled in the InterSystems Incubator program. If you are into the "Compostable CDP" movement and want to master your FHIR Repository in place you may be in the right spot. Demo FHIR SQL Builder This is an easy 3 step process for FHIR SQL. Set up an Analysis Set up a Transform Set up a Projection Zingg.ai The documentation for Zingg is off the chain exhaustive and though extensible and scalable beyond this simple demo, here are the basics. Find Label Train Match Link Let's get on with it as they say... FHIR SQL We were awarded a full trial of the Health Connect Cloud suite for the duration of the incubator, included in that was the FHIR Server with FHIR SQL Enabled. The FHIR SQL Builder (abbreviated “builder”) is a sophisticated projection tool to help developers create custom SQL schemas using data in their FHIR (Fast Healthcare Interoperability Resources) repository without moving the data to a separate SQL repository. The objective of the builder is to enable data analysts and business intelligence developers to work with FHIR using familiar analytic tools, such as ANSI SQL, Power BI, or Tableau, without having to learn a new query syntax. I know right? Super great, now our "builder" is going to go to work projecting the fields weed need to identify the duplicate the records with Zingg. First step is the analysis, which you can hardly go wrong clicking this one into submission, simply point the analysis to "localhost" which is essentially the InterSystems FHIR Server underneath. The transforms are the critical piece to get right, you need to build these fields out to flatten the FHIR resources so they can be read to make some decisions over the SQL super server. These most likely will need some thought as the more fields you transform to sql, the better your machine learning model will end up when prospecting for duplicates. Now, a data steward type should typically be building these, but I whooped up a few for the demo with questionable simplicity, this was done by using the "clown suit" by hitting the pencil to generate them, but considering the sophistication that may go into them, you can import and export them as well. Pay special attention to the "Package" and "Name" as this will be the source table for your sql connection. This transform is essentially saying we want to use name and gender to detect duplicates. 
Patient Transform example { "name": "PIDTOO Patient", "description": "Patients for PIDTOO FHIR Dedupping Engine", "resources": [ { "resourceType": "Patient", "columns": [ { "name": "NameFamily", "type": "String", "path": "Patient.name.family", "index": false }, { "name": "NameGiven", "type": "String", "path": "Patient.name.given", "index": false }, { "name": "Gender", "type": "String", "path": "Patient.gender", "index": false }, { "name": "AddressPostalCode", "type": "String", "path": "Patient.address.postalCode", "index": false }, { "name": "IdentifierValue", "type": "String", "path": "Patient.identifier.value", "index": false } ] } ] } Now, the last part is essentially scheduling the job to project the data to a target schema. I think you will be presently surprised that as data is added to the FHIR Server, the projections fill automatically, phew. Now to seal the deal with the setup of FHIR SQL, you can see the projection for Patient (PIDTOO.Patient) being visible, then you create another table to store the output from the dedupe run (PIDTOO.PatientDups). Another step you will need to complete is enabling the Firewall so that external connections are enabled for your deployment, and you have allowed access for the source CIDR block connecting to the super server. Mentally bookmark the overview page, as it has the connectivity information and credentials needed to connect before moving to the next step. DeDupe with Zingg Zingg is super powerful, OSS, runs on Spark and scales to your wallet when it comes to de-duplication of datasets large and small. I don't want to oversimplify the task, but the reality is the documentation, functional container are enough to get up and running very quickly. We will keep this to the brass tacks though to minimally point out what needs to be completed to execute your first de-duplication job fired off against an IRIS Database. Install Clone the zingg.ai repo: https://github.com/zinggAI/zingg We also need the JDBC driver for Zingg to connect with IRIS. Download the IRIS JDBC driver and add the path of the driver to spark.jars property of zingg.conf... organize this jar in the `thirdParty/lib` directory. spark.jars=/home/sween/Desktop/PIDTOO/api/zingg/thirdParty/lib/intersystems-jdbc-3.7.1.jar Match Criteria This step takes some thought and can be fun if you are into this stuff or enjoy listening to black box recordings of plane crashes on YouTube. To demonstrate things, recall the fields we projected from "builder" to establish the match criteria. All of this is done in our python implementation that declares the PySpark job, which will be revealed in its entirety down a ways. # FHIRSQL Source Object FIELDDEFS # Pro Tip! # These Fields are included in FHIRSQL Projections, but not specified in the Transform fhirkey = FieldDefinition("Key", "string", MatchType.DONT_USE) dbid = FieldDefinition("ID", "string", MatchType.DONT_USE) # Actual Fields from the Projection srcid = FieldDefinition("IdentifierValue", "string", MatchType.DONT_USE) given = FieldDefinition("NameFamily", "string", MatchType.FUZZY) family = FieldDefinition("NameGiven", "string", MatchType.FUZZY) zip = FieldDefinition("AddressPostalCode", "string", MatchType.ONLY_ALPHABETS_FUZZY) gender = FieldDefinition("Gender", "string", MatchType.FUZZY) fieldDefs = [fhirkey, dbid, srcid, given, family,zip, gender] So the fields match the attributes in our IRIS project and the MatchTypes we set for each field type. 
You'll be delighted with what is available as you can immediately put them to good use with clear understanding. Three common ones are here: FUZZY: Generalized matching with strings and stuff EXACT: No variations allowed, deterministic value, guards against domain conflicts sorta. DONT_USE: Fields that have nothing to do with the matching, but needed in the remediation or understanding in the output. Some other favorites of mine are here, as they seem to work on dirty data a little bit better and make sense of multiple emails. EMAIL: Hacks off the domain name and the @, and uses the string. TEXT: Things between two strings ONLY_ALPHABETS_FUZZY: Omits integers and non-alphas where they clearly do not belong for match consideration The full list is available here for the curious. Model Create a folder to build your model... this one follows the standard in the repo, create folder `models/700`. # Object MODEL args = Arguments() args.setFieldDefinition(fieldDefs) args.setModelId("700") args.setZinggDir("/home/sween/Desktop/PIDTOO/api/zingg/models") args.setNumPartitions(4) args.setLabelDataSampleSize(0.5) Input These values are represented in what we setup in the previous steps on "builder" # "builder" Projected Object FIELDDEFS InterSystemsFHIRSQL = Pipe("InterSystemsFHIRSQL", "jdbc") InterSystemsFHIRSQL.addProperty("url","jdbc:IRIS://3.131.15.187:1972/FHIRDB") InterSystemsFHIRSQL.addProperty("dbtable", "PIDTOO.Patient") InterSystemsFHIRSQL.addProperty("driver", "com.intersystems.jdbc.IRISDriver") InterSystemsFHIRSQL.addProperty("user","fhirsql") # Use the same password that is on your luggage InterSystemsFHIRSQL.addProperty("password","1234") args.setData(InterSystemsFHIRSQL) Output Now this table is not a projected table by "builder", it is an empty table we created to house the results from Zingg. # Zingg's Destination Object on IRIS InterSystemsIRIS = Pipe("InterSystemsIRIS", "jdbc") InterSystemsIRIS.addProperty("url","jdbc:IRIS://3.131.15.187:1972/FHIRDB") InterSystemsIRIS.addProperty("dbtable", "PIDTOO.PatientDups") InterSystemsIRIS.addProperty("driver", "com.intersystems.jdbc.IRISDriver") InterSystemsIRIS.addProperty("user","fhirsql") # Please use the same password as your luggage InterSystemsIRIS.addProperty("password","1234") args.setOutput(InterSystemsIRIS) If you are trying to understand the flow here, hopefully this will clarify things. 1. Zingg reads the projected data from builder (PIDTOO.Patient)2. We do some "ML Shampoo" against the data.3. Then we write the results back to builder (PIDTOO.PatientDups) Thanks to @Sergei.Shutov3787 for the icons! ML Shampoo Now, Zingg is a supervised machine learning implementation, so you are going to have to train it up front, and at an interval to keep the model smart. Its the "rinse and repeat" part of the analogy if you havent gotten the shampoo reference from above. Find - Go get some data Label - Prompt the human to help us out Train - Once we have enough labelled data Match - Zingg writes out the results Link bash scripts/zingg.sh --properties-file config/zingg-iris.conf --run pidtoo-iris/FHIRPatient-IRIS.py findTrainingData bash scripts/zingg.sh --properties-file config/zingg-iris.conf --run pidtoo-iris/FHIRPatient-IRIS.py label bash scripts/zingg.sh --properties-file config/zingg-iris.conf --run pidtoo-iris/FHIRPatient-IRIS.py train For the find, you will get something a little bit like the below if things are working correctly. 
findTrainingData 2023-06-07 16:20:03,677 [Thread-6] INFO zingg.ZinggBase - Start reading internal configurations and functions 2023-06-07 16:20:03,690 [Thread-6] INFO zingg.ZinggBase - Finished reading internal configurations and functions 2023-06-07 16:20:03,697 [Thread-6] WARN zingg.util.PipeUtil - Reading input jdbc 2023-06-07 16:20:03,697 [Thread-6] WARN zingg.util.PipeUtil - Reading Pipe [name=InterSystemsFHIRSQL, format=jdbc, preprocessors=null, props={password=1234luggage, driver=com.intersystems.jdbc.IRISDriver, dbtable=PIDTOO.Patient, user=fhirsql, url=jdbc:IRIS://3.131.15.187:1972/FHIRDB}, schema=null] 2023-06-07 16:20:38,708 [Thread-6] WARN zingg.TrainingDataFinder - Read input data 71383 2023-06-07 16:20:38,709 [Thread-6] WARN zingg.util.PipeUtil - Reading input parquet 2023-06-07 16:20:38,710 [Thread-6] WARN zingg.util.PipeUtil - Reading Pipe [name=null, format=parquet, preprocessors=null, props={location=/home/sween/Desktop/PIDTOO/api/zingg/models/700/trainingData//marked/}, schema=null] 2023-06-07 16:20:39,130 [Thread-6] WARN zingg.util.DSUtil - Read marked training samples 2023-06-07 16:20:39,139 [Thread-6] WARN zingg.util.DSUtil - No configured training samples 2023-06-07 16:20:39,752 [Thread-6] WARN zingg.TrainingDataFinder - Read training samples 37 neg 64 2023-06-07 16:20:39,946 [Thread-6] INFO zingg.TrainingDataFinder - Preprocessing DS for stopWords 2023-06-07 16:20:40,275 [Thread-6] INFO zingg.util.Heuristics - **Block size **35 and total count was 35695 2023-06-07 16:20:40,276 [Thread-6] INFO zingg.util.Heuristics - Heuristics suggest 35 2023-06-07 16:20:40,276 [Thread-6] INFO zingg.util.BlockingTreeUtil - Learning indexing rules for block size 35 2023-06-07 16:20:40,728 [Thread-6] WARN org.apache.spark.sql.execution.CacheManager - Asked to cache already cached data. 2023-06-07 16:20:40,924 [Thread-6] INFO zingg.util.ModelUtil - Learning similarity rules 2023-06-07 16:20:41,072 [Thread-6] WARN org.apache.spark.sql.catalyst.util.package - Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'. 
2023-06-07 16:20:41,171 [Thread-6] INFO org.apache.spark.ml.util.Instrumentation - [06bfeecf] Stage class: LogisticRegression 2023-06-07 16:20:41,171 [Thread-6] INFO org.apache.spark.ml.util.Instrumentation - [06bfeecf] Stage uid: logreg_d240511c93be 2023-06-07 16:20:41,388 [Thread-6] INFO org.apache.spark.ml.util.Instrumentation - [06bfeecf] training: numPartitions=1 storageLevel=StorageLevel(1 replicas) 2023-06-07 16:20:41,390 [Thread-6] INFO org.apache.spark.ml.util.Instrumentation - [06bfeecf] {"featuresCol":"z_feature","fitIntercept":true,"labelCol":"z_isMatch","predictionCol":"z_prediction","probabilityCol":"z_probability","maxIter":100} 2023-06-07 16:20:41,752 [Thread-6] INFO org.apache.spark.ml.util.Instrumentation - [06bfeecf] {"numClasses":2} 2023-06-07 16:20:41,752 [Thread-6] INFO org.apache.spark.ml.util.Instrumentation - [06bfeecf] {"numFeatures":164} 2023-06-07 16:20:41,752 [Thread-6] INFO org.apache.spark.ml.util.Instrumentation - [06bfeecf] {"numExamples":101} 2023-06-07 16:20:41,753 [Thread-6] INFO org.apache.spark.ml.util.Instrumentation - [06bfeecf] {"lowestLabelWeight":"37.0"} 2023-06-07 16:20:41,753 [Thread-6] INFO org.apache.spark.ml.util.Instrumentation - [06bfeecf] {"highestLabelWeight":"64.0"} 2023-06-07 16:20:41,755 [Thread-6] INFO org.apache.spark.ml.util.Instrumentation - [06bfeecf] {"sumOfWeights":101.0} 2023-06-07 16:20:41,756 [Thread-6] INFO org.apache.spark.ml.util.Instrumentation - [06bfeecf] {"actualBlockSizeInMB":"1.0"} 2023-06-07 16:20:42,149 [Executor task launch worker for task 0.0 in stage 29.0 (TID 111)] WARN com.github.fommil.netlib.BLAS - Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS 2023-06-07 16:20:42,149 [Executor task launch worker for task 0.0 in stage 29.0 (TID 111)] WARN com.github.fommil.netlib.BLAS - Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS 2023-06-07 16:20:44,470 [Thread-6] INFO org.apache.spark.ml.util.Instrumentation - [5a6fd183] training finished 2023-06-07 16:20:44,470 [Thread-6] INFO zingg.model.Model - threshold while predicting is 0.5 2023-06-07 16:20:44,589 [Thread-6] INFO org.apache.spark.ml.util.Instrumentation - [aa3d8dc3] training finished 2023-06-07 16:20:44,600 [Thread-6] INFO zingg.TrainingDataFinder - Writing uncertain pairs 2023-06-07 16:20:47,788 [Thread-6] WARN zingg.util.PipeUtil - Writing output Pipe [name=null, format=parquet, preprocessors=null, props={location=/home/sween/Desktop/PIDTOO/api/zingg/models/700/trainingData//unmarked/}, schema=null] Now, we train the Cylon with supervised learning, lets give it a go. Label 2023-06-07 16:24:06,122 [Thread-6] INFO zingg.Labeller - Processing Records for CLI Labelling Labelled pairs so far : 37/101 MATCH, 64/101 DO NOT MATCH, 0/101 NOT SURE Current labelling round : 0/20 pairs labelled +----------------+------+---------------+----------+---------+-----------------+------+-------------------+ |Key |ID |IdentifierValue|NameFamily|NameGiven|AddressPostalCode|Gender|z_source | +----------------+------+---------------+----------+---------+-----------------+------+-------------------+ |Patient/05941921|303302|null |davis |derek |28251 |male |InterSystemsFHIRSQL| |Patient/05869254|263195|null |davis |terek |27|07 |male |InterSystemsFHIRSQL| +----------------+------+---------------+----------+---------+-----------------+------+-------------------+ Zingg predicts the above records MATCH with a similarity score of 0.51 What do you think? 
Your choices are: No, they do not match : 0 Yes, they match : 1 Not sure : 2 To exit : 9 Please enter your choice [0,1,2 or 9]: Now, do what the cylon says, and do this a lot, maybe during meetings or on the Red Line on your way or heading home from work (Get it? Train). You'll need enough labels for the train phase, where Zingg goes to town and works its magic finding duplicates. bash scripts/zingg.sh --properties-file config/zingg-iris.conf --run pidtoo-iris/FHIRPatient-IRIS.py train Ok, here we go, lets get our results: bash scripts/zingg.sh --properties-file config/zingg-iris.conf --run pidtoo-iris/FHIRPatient-IRIS.py match We now have some results back in the PIDTOO.PatientDups table that gets us to the point of things. We are going to use @Dmitry.Maslennikov 's sqlalchemy sorcery to connect via the notebook and inspect our results. from sqlalchemy import create_engine # FHIRSQL Builder Cloud Instance engine = create_engine("iris://fhirsql:1234@3.131.15.187:1972/FHIRDB") conn = engine.connect() query = ''' SELECT TOP 20 z_cluster, z_maxScore, z_minScore, NameGiven, NameFamily, COUNT(*) FROM PIDTOO.PatientDups GROUP BY z_cluster HAVING COUNT(*) > 1 ''' result = conn.exec_driver_sql(query) print(result) It takes a little bit to interpret the results, but, here is the result of the brief training on loading the NC voters data into FHIR. loadncvoters2fhir.py © import os import requests import json import csv ''' recid,givenname,surname,suburb,postcode 07610568,ranty,turner,statesvikle,28625 ''' for filename in os.listdir("."): print(filename) if filename.startswith("ncvr"): with open(filename, newline='') as csvfile: ncreader = csv.reader(csvfile, delimiter=',') for row in ncreader: patid = row[0] given = row[1] family = row[2] postcode = row[4] patientpayload = { "resourceType": "Patient", "id": patid, "active": True, "name": [ { "use": "official", "family": family, "given": [ given ] } ], "gender": "male", "address": [ { "postalCode": postcode } ] } print(patientpayload) url = "https://fhir.h7kp7tr48ilp.workload-nonprod-fhiraas.isccloud.io/Patient/" + patid headers = { 'x-api-key': '1234', 'Content-Type': 'application/fhir+json' } response = requests.request("PUT", url, headers=headers, data=json.dumps(patientpayload)) print(response.status_code) The output Zingg gave us is pretty great for the minimal effort I put in training things in between gas lighting. z_cluster is the id Zingg assigns to the duplicates, I call it the "dupeid", just understand that is the identifier of the you want to query to examine the potential duplicates... Im accustomed to trusting a minScore of 0.00 and anything over 0.90 for a score for examination. 
(189, 0.4677305247393828, 0.4677305247393828, 'latonya', 'beatty', 2) (316, 0.8877195988867068, 0.7148998161578, 'wiloiam', 'adams', 5) (321, 0.5646965557084127, 0.0, 'mar9aret', 'bridges', 3) (326, 0.5707960437038071, 0.0, 'donnm', 'johnson', 6) (328, 0.982044685998597, 0.40717509762282955, 'christina', 'davis', 4) (333, 0.8879795543643093, 0.8879795543643093, 'tiffany', 'stamprr', 2) (334, 0.808243240184001, 0.0, 'amanta', 'hall', 4) (343, 0.6544295790716498, 0.0, 'margared', 'casey', 3) (355, 0.7028336885619522, 0.7028336885619522, 'dammie', 'locklear', 2) (357, 0.509141927875999, 0.509141927875999, 'albert', 'hardisfon', 2) (362, 0.5054569794103886, 0.0, 'zarah', 'hll', 6) (366, 0.4864567456390275, 0.4238040425261962, 'cara', 'matthews', 4) (367, 0.5210329255531461, 0.5210329255531461, 'william', 'metcaif', 2) (368, 0.6431091575056218, 0.6431091575056218, 'charles', 'sbarpe', 2) (385, 0.5338624802449684, 0.0, 'marc', 'moodt', 3) (393, 0.5640435106505274, 0.5640435106505274, 'marla', 'millrr', 2) (403, 0.4687497402769476, 0.0, 'donsna', 'barnes', 3) (407, 0.5801171648347092, 0.0, 'veronicc', 'collins', 35) (410, 0.9543673811569922, 0.0, 'ann', 'mason', 7) (414, 0.5355771790403805, 0.5355771790403805, 'serry', 'mccaray', 2) Let's pick the "dupeid" 410 and see how we did, the results seem to think there are 7 duplicates. Ok, so there are the 7 records, with variable scores... Lets dial it in a little bit more and only report back a score of higher than .90. Wooo! So now, if you recall, we have the `MatchType.DONT_USE` for `Key` in our match criteria showing up in our output, but you know what? USE IT! https://fhir.h7kp7tr48ilp.workload-nonprod-fhiraas.isccloud.io/Patient/04892325 https://fhir.h7kp7tr48ilp.workload-nonprod-fhiraas.isccloud.io/Patient/02049329 These are the FHIR patient resource ids in the FHIR repository we have identified as duplicates and require remediation.🔥
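The drill-down behind those screenshots can be reproduced with one more query. Here is a hedged sketch, reusing the sqlalchemy connection from above; the filter uses the z_maxScore column already selected earlier, so adjust the threshold or column names if your Zingg version names them differently.

drill_down = '''
SELECT "Key", z_cluster, z_minScore, z_maxScore, NameGiven, NameFamily, AddressPostalCode
FROM PIDTOO.PatientDups
WHERE z_cluster = 410
  AND z_maxScore > 0.90
ORDER BY z_maxScore DESC
'''
for row in conn.exec_driver_sql(drill_down):
    print(row)

The Key values returned are the FHIR Patient resource ids to feed into remediation, as noted above.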
Article
Mark Bolinsky · Jul 1, 2016

InterSystems Example Reference Architecture for Microsoft Azure Resource Manager (ARM)

++Update: August 2, 2018 This article provides a reference architecture as a sample for providing robust performing and highly available applications based on InterSystems Technologies that are applicable to Caché, Ensemble, HealthShare, TrakCare, and associated embedded technologies such as DeepSee, iKnow, Zen and Zen Mojo. Azure has two different deployment models for creating and working with resources: Azure Classic and Azure Resource Manager. The information detailed in this article is based on the Azure Resource Manager model (ARM). Summary Microsoft Azure cloud platform provides a feature rich environment for Infrastructure-as-a-Service (IaaS) as a cloud offering fully capable of supporting all of InterSystems products. Care must be taken, as with any platform or deployment model, to ensure all aspects of an environment are considered such as performance, availability, operations, and management procedures. Specifics of each of those areas will be covered in this article. Performance Within Azure ARM there are several options available for compute virtual machines (VMs) and associated storage options, and the most directly related to InterSystems products are network attached IaaS disks stored as VHD files in Azure page blob storage. There are several other options such as Blob (block), File and others, however those are more specific to an individual application’s requirements rather than supporting the operations of Caché. There are two types of storage where the disks are stored: Premium and Standard. Premium storage is more suited for production workloads that require guaranteed predictable low-latency Input/Output Operations per Second (IOPs) and throughput. Standard storage is a more economical option for non-production or archive type workloads. Care must be taken when selecting a particular VM type because not all VM types can have access to premium storage. Virtual IP Address and Automatic Failover Most IaaS cloud providers lacked the ability to provide for a Virtual IP (VIP) address that is typically used in database failover designs. To address this, several of the most commonly used connectivity methods, specifically ECP clients and CSP Gateways, have been enhanced within Caché to no longer rely on VIP capabilities making them mirror-aware. Connectivity methods such as xDBC, direct TCP/IP sockets, or other direct connect protocols, require the use of a VIP. To address those, InterSystems database mirroring technology makes it possible to provide automatic failover for those connectivity methods within Azure using APIs to interact with the Azure Internal Load Balancer (ILB) to achieve VIP-like functionality, thus providing a complete and robust high availability design within Azure. Details of this can be found in the Community article Database Mirroring without a Virtual IP address. Backup Operations Performing a backup using either traditional file-level or snapshot based backups can be a challenge in cloud deployments. This can now be achieved within Azure ARM platform using Azure Backup and Azure Automation Run Books along with InterSystems External Freeze and Thaw API capabilities to allow for true 24x7 operational resiliency and assurance of clean regular backups. Alternatively, many of the third-party backup tools available on the market can be used by deploying backup agents within the VM itself and leveraging file-level backups in conjunction with Logical Volume Manager (LVM) snapshots. 
Example Architecture As part of this document, a sample Azure architecture is provided as a starting point for your application specific deployment, and can be used as a guideline for numerous deployment possibilities. This reference architecture demonstrates a highly robust Caché database deployment including database mirror members for high availability, application servers using InterSystems Enterprise Cache Protocol (ECP), web servers with InterSystems CSP Gateway, and both internal and external Azure load balancers. Azure Architecture Deploying any Caché based application on Microsoft Azure requires some specific considerations in certain areas. The section discusses these areas that need to be considered in addition to any regular technical requirements you may have for your application. Two examples are being provided in this document one based on InterSystems TrakCare unified healthcare information system, and another option based on a complete InterSystems HealthShare health informatics platform deployment including: Information Exchange, Patient Index, Health Insight, Personal Community, and Health Connect. Virtual Machines Azure virtual machines (VMs) are available in two tiers: basic and standard. Both types offer a choice of sizes. The basic tier does not provide some capabilities available in the standard tier, such as load balancing and auto-scaling. For this reason, the standard tier is used for TrakCare deployments. Standard tier VMs come in various sizes grouped in different series, i.e. A, D, DS, F, FS, G, and GS. The DS, GS, and new FS sizes support the use of Azure Premium Storage. Production servers typically need to use Premium Storage for reliable, low-latency and high-performance. For this reason, the example TrakCare and HealthShare deployment architectures detailed in this document will be using either FS, DS or GS series VMs. Note that not all virtual machine sizes are available in all regions. For more details of sizes for virtual machines see: Windows Virtual Machine Sizes Linux Virtual Machine Sizes Storage Azure Premium Storage is required for TrakCare and HealthShare servers. Premium Storage stores data on Solid State Drives (SSDs) and provides high throughput at low latencies, whereas Standard Storage stores data on Hard Disk Drives (HDDs) resulting in lower performance levels. Azure Storage is a redundant and highly available system, however, it is important to notice that Availability Sets currently don’t provide redundancy across storage fault domains and in rare circumstances this can lead to issues. Microsoft has mitigation workarounds and is working on making this process widely available and easier to end-customers. It is advisable to work directly with your local Microsoft team to determine if any mitigation is required. When a disk is provisioned against a premium storage account, IOPS and throughput, (bandwidth) depends on the size of the disk. Currently, there are three types of premium storage disks: P10, P20, and P30. Each one has specific limits for IOPS and throughput as specified in the following table. Premium Disks Type P4 P6 P10 P15 P20 P30 P40 P50 Disk Size 32GB 64GB 128GB 256GB 512GB 1024GB 2048GB 4096GB IOPS per disk 120 240 500 1100 2300 5000 7500 7500 Throughput per disk 25MB/s 50MB/s 100MB/s 125MB/s 150MB/s 200MB/s 250MB/s 250MB/s Note: Ensure there is sufficient bandwidth available on a given VM to drive the disk traffic. 
For example, a STANDARD_DS13 VM has 256 MB per second dedicated bandwidth available for all premium storage disk traffic. That means four P30 premium storage disks attached to this VM have a throughput limit of 256 MB per second and not the 800 MB per second that four P30 disks could theoretically provide. For more details and limits on premium storage disks, including provisioned capacity, performance, sizes, IO sizes, Cache hits, throughput targets, and throttling see: Premium Storage High Availability InterSystems recommends having two or more virtual machines in a defined Availability Set. This configuration is required because during either a planned or unplanned maintenance event, at least one virtual machine will be available to meet the 99.95% Azure SLA. This is important because during data center updates, VMs are brought down in parallel, upgraded, and brought back online in no particular order leaving the application unavailable during this maintenance window. Therefore, a highly available architecture requires two of every server, i.e. load balanced web servers, database mirrors, multiple application servers and so on. For more information on Azure high availability best practices see: Managing Availability Web Server Load Balancing External and internal load balanced web servers may be required for your Caché based application. External load balancers are used for access over the Internet or WAN (VPN or Express Route) and internal load balancers are potentially used for internal traffic. The Azure load balancer is a Layer-4 (TCP, UDP) type load balancer that distributes incoming traffic among healthy service instances in cloud services or virtual machines defined in a load balancer set. The web server load balancers must be configured with client IP address session persistence (2 tuple) and the shortest probe timeout possible, which is currently 5 seconds. TrakCare requires session persistence for the period a user is logged in. The following diagram provided by Microsoft demonstrates a simple example of the Azure Load Balancer within an ARM deployment model. For more information on Azure load balancer features such as distribution algorithm, port forwarding, service monitoring, Source NAT, and different types of available load balancers see: Load Balancer Overview In addition to the Azure external load balancer, Azure provides the Azure Application Gateway. The Application Gateway is a L7 load balancer (HTTP/HTPS) with support for cookie-based session affinity and SSL termination (SSL offload). SSL offloading removes the encryption/decryption overhead from the Web servers, since the SSL connection is terminated at the load balancer. This approach simplifies management as the SSL certificate is deployed and managed in the getaway instead of all the nodes in the web farm. For more information, see: Application Gateway overview Configure an Application Gateway for SSL offload by using Azure Resource Manager Database Mirroring When deploying Caché based applications on Azure, providing high availability for the Caché database server requires the use of synchronous database mirroring to provide high availability in a given primary Azure region and potentially asynchronous database mirroring to replicate data to a hot standby in a secondary Azure region for disaster recovery depending on your uptime service level agreements requirements. 
A database mirror is a logical grouping of two database systems, known as failover members, which are physically independent systems connected only by a network. After arbitrating between the two systems, the mirror automatically designates one of them as the primary system; the other one automatically becomes the backup system. External client workstations or other computers connect to the mirror through the mirror Virtual IP (VIP), which is specified during mirroring configuration. The mirror VIP is automatically bound to an interface on the primary system of the mirror. Note: In Azure, it is not possible to configure the mirror VIP, so an alternative solution has been devised. The current recommendation for deploying a database mirror in Azure is to configure three VMs (primary, backup, arbiter) in the same Azure Availability Set. This ensures that at any given time, Azure will guarantee external connectivity with at least two of these VMs with a 99.95% SLA, and that each will be in different update and fault domains. This provides adequate isolation and redundancy of the database data itself. Additional details on can be found here: Azure Availability Sets Azure Server Level Agreements (SLAs) A challenge within any IaaS cloud provider, including Azure, is the handling of automatic failover of the client connections to the application with the absence of Virtual IP capabilities. To retain automatic failover for client connections a couple directions have been taken. Firstly, InterSystems has enhanced the CSP gateway to become mirror-aware so connectivity from a web server with the CSP Gateway to a database server no longer requires a VIP. The CSP gateway will auto-negotiate with both the of the mirror members and redirect to the appropriate member whichever is the primary mirror member. This goes along with the already mirror-aware capabilities of ECP clients if using them. Secondly, connectivity outside of the CSP Gateways and ECP clients still requires a VIP-like capability. InterSystems recommends the use of the polling method with the mirror_status.cxw health check status page detailed in the community article Database Mirroring without a Virtual IP address. The Azure Internal Load Balancer (ILB) will provide a single IP address as a VIP-like capability to direct all network traffic to the primary mirror member. The ILB will only distribute traffic to the primary mirror member. This method does not rely on polling, and allows for an immediate redirection upon any mirror member within a mirror configuration becoming the primary member. Polling may be used in conjunction with this method is some DR scenarios using Azure Traffic Manager. Backup and Restore There are multiple options available for backup operations. The following three options are viable for your Azure deployment with InterSystems products. The first two options incorporate a snapshot type procedure which involves suspending database writes to disk prior to create the snapshot and then resuming updates once the snapshot was successful. The following high-level steps are taken to create a clean backup using either of the snapshot methods: Pause writes to the database via database Freeze API call. Create snapshots of the OS + data disks. Resume Caché writes via database Thaw API call. Backup facility archives to backup location Additional steps such as integrity checks can be added on a periodic interval to ensure clean and consistent backup. 
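To make the freeze/thaw sequence concrete, here is a hedged sketch of the kind of script an Azure Automation Runbook could invoke on the database VM. The instance name, resource group, and disk name are placeholders, and the exit-status convention follows the sample external backup scripts in InterSystems documentation, so verify it against your version before relying on it.

#!/bin/bash
INSTANCE=CACHE1                        # placeholder Cache instance name

# 1. Pause database writes (external freeze)
csession $INSTANCE -U%SYS "##Class(Backup.General).ExternalFreeze()"
status=$?
if [ $status -ne 5 ]; then             # 5 = frozen successfully, per the documented sample scripts
    echo "ExternalFreeze failed (exit $status); skipping snapshot." >&2
    exit 1
fi

# 2. Snapshot the OS and data disks (placeholder Azure CLI call)
az snapshot create --resource-group MyResourceGroup \
    --name cache-data-$(date +%Y%m%d) \
    --source cache-data-disk

# 3. Resume database writes (external thaw)
csession $INSTANCE -U%SYS "##Class(Backup.General).ExternalThaw()"

Integrity checks or copying the snapshot to another region can be appended after the thaw so they do not extend the freeze window.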
The decision points on which option to use depends on the operational requirements and policies of your organization. InterSystems is available to discuss the various options in more detail. Azure Backup Backup operations can now be achieved within Azure ARM platform using Azure Backup and Azure Automation Runbooks along with InterSystems External Freeze and Thaw API capabilities to allow for true 24x7 operational resiliency and assurance of clean regular backups. Details for managing and automating Azure Backups can be found here. Logical Volume Manager Snapshots Alternatively, many of the third-party backup tools available on the market can be used by deploying individual backup agents within the VM itself and leveraging file-level backups in conjunction with Logical Volume Manager (LVM) snapshots. One of the major benefits to this model is having the ability to have file-level restores of either Windows or Linux based VMs. A couple of points to note with this solution, is since Azure and most other IaaS cloud providers do not provide tape media, all backup repositories are disk-based for short term archiving and have the ability to leverage blob or bucket type low cost storage for long-term retention (LTR). It is highly recommended if using this method to use a backup product that supports de-duplication technologies to make the most efficient use of disk-based backup repositories. Some examples of these backup products with cloud support include but is not limited to: Commvault, EMC Networker, HPE Data Protector, and Veritas Netbackup. InterSystems does not validate or endorses one product over the other. Caché Online Backup For small deployments the built-in Caché Online Backup facility is also a viable option as well. This InterSystems database online backup utility backs up data in database files by capturing all blocks in the databases then writes the output to a sequential file. This proprietary backup mechanism is designed to cause no downtime to users of the production system. In Azure, after the online backup has finished, the backup output file and all other files in use by the system must be copied to an Azure File share. This process needs to be scripted and executed within the virtual machine. The Azure File shares should use an Azure RA-GRS storage account for maximum availability. Note Azure File shares have a maximum share size of 5TB, a maximum file size of 1TB, and maximum 60 MB/s throughput per share (shared by all clients). Online backup is the entry-level approach for smaller sites wishing to implement a low cost solution for backup. However, as databases increase in size, external backups with snapshot technology are recommended as a best practice with advantages including the backup of external files, faster restore times, and an enterprise-wide view of data and management tools. Disaster Recovery When deploying a Caché based application on Azure, Disaster Recovery (DR) resources including network, servers and storage are recommended to be in different Azure region. The amount of capacity required in the designated DR Azure region depends on your organizational needs. In most cases 100% of the production capacity is required when operating in a DR mode, however lesser capacity can be provisioned until more is needed as an elastic model. Asynchronous database mirroring is used to continuously replicate to the DR Azure region’s virtual machines. 
Mirroring uses database transaction journals to replicate updates over a TCP/IP network in a way that has minimal performance impact on the primary system. Compression and encryption is highly recommended to be configured with these DR Asynchronous mirror members. All external clients on the general Internet who wish to access the application will be routed through an Azure Traffic Manager as a DNS service. Microsoft Azure Traffic Manager (ATM) is used as a switch to direct traffic to the current active data center. Azure Traffic Manager supports a number of algorithms to determine how end users are routed to the various service endpoints. Details of various algorithms can be found here. For the purpose of this document, the ‘priority’ traffic-routing method will be used in conjunction with Traffic Manager endpoint monitoring and failover. Details of endpoint monitoring and failover can be found here. Traffic Manager works by making regular requests to each endpoint and then verifying the response. If an endpoint fails to provide a valid response, Traffic Manager shows its status as Degraded. It is no longer included in DNS responses, which instead will return an alternative, available endpoint. In this way, user traffic is directed away from failing endpoints and toward endpoints that are available. Using the above methods, only the specific region and specific mirror member will only ever allow traffic to it. This is controlled by the endpoint definition which is a mirror_status page presented from the InterSystems CSP Gateway. Only the primary mirror member will ever report “success” as a HTTP 200 from the monitor probing. The following diagram provided by Microsoft demonstrates at a high-level the priority traffic-routine algorithm. The Azure Traffic Manager will yield a single endpoint such as: "https://my-app.trafficmanager.net" that all clients can connect to. In addition, an A record could be configured to provide a vanity URL such as "https://www.my-app-domain.com". The Azure Traffic Manager shall be configured with one profile that contains the addresses of both regions’ end point. At any given time, only one of the regions will report online based on the endpoint monitoring. This ensures that traffic only flows to one region at a given time. There are no added steps needed for failover between the regions since the endpoint monitoring will detect the application in the primary Azure region is down and the application is now live in the secondary Azure region. This is because the DR Async mirror member being promoted to primary and then allows the CSP Gateway to report HTTP 200 to the Traffic Manager endpoint monitoring. There are many alternatives to the above described solution, and can be customized based on your organization operational requirements and service level agreements. Network Connectivity Depending on your application’s connectivity requirements, there are multiple connectivity models using either Internet, IPSEC VPN, or a dedicated link using Azure Express Route are available. The method to choose will depend on the application and user needs. The bandwidth usage for each of the three methods vary, and best to check with your Azure representative or Azure Portal for confirmation of available connectivity options for a given region. If you are using Express Route, there are several options including multiple circuits and multi-region access that can be enabled for disaster recovery scenarios. 
It is important to work with the Express Route provider to understand the high availability and disaster recovery scenarios they support. Security Care needs to be taken when deciding to deploy an application in a public cloud provider. Your organization’s standard security policies, or new ones developed specifically for cloud, should be followed to maintain security compliance of your organization. Cloud deployments have the added risk of data now outside client data centers and physical security control. The use of InterSystems database and journal encryption for data at rest (databases and journals) and data in flight (network communications) with AES and SSL/TLS encryption respectively are highly recommended. As with all encryption key management, proper procedures need to be documented and followed per your organization’s policies to ensure data safety and prevent unwanted data access or security breech. When access is allowed over the Internet, third party firewall devices may be required for extra functionality such as intrusion detection, denial of service protection etc. Architecture Diagram Examples The diagrams below illustrates a typical Caché installation providing high availability in the form of database mirroring (both synchronous failover and DR Asynchronous), application servers using ECP, and multiple load balanced web servers. TrakCare Example The following diagram illustrates a typical TrakCare deployment with multiple load balanced webservers, two EPS print servers as ECP clients, and database mirror configuration. The Virtual IP address is only used for connectivity not associated with ECP or the CSP Gateway. The ECP clients and CSP Gateway are mirror-aware and do not require a VIP. The sample reference architecture diagram below includes high availability in the active or primary region, and disaster recovery to another Azure region if the primary Azure region is unavailable. Also within this example, the database mirrors contain the TrakCare DB, TrakCare Analytics, and Integration namespace all within that single mirror set. TrakCare Azure Reference Architecture Diagram - PHYSICAL ARCHITECTURE In addition, the following diagram is provided showing a more logical view of architecture with the associated high-level software products installed and functional purpose. TrakCare Azure Reference Architecture Diagram - LOGICAL ARCHITECTURE HealthShare Example The following diagram illustrates a typical HealthShare deployment with multiple load balanced webservers, with multiple HealthShare products including Information Exchange, Patient Index, Personal Community, Health Insight, and Health Connect. Each of those respective products include a database mirror pair for high availability within an Azure availability set. The Virtual IP address is only used for connectivity not associated with ECP or the CSP Gateway. The CSP Gateways used for web service communications between the HealthShare products are mirror-aware and do not require a VIP. The sample reference architecture diagram below includes high availability in the active or primary region, and disaster recovery to another Azure region if the primary Azure region is unavailable. HealthShare Azure Reference Architecture Diagram – PHYSICAL ARCHITECTURE In addition, the following diagram is provided showing a more logical view of architecture with the associated high-level software products installed, connectivity requirements and methods, and the respective functional purpose. 
HealthShare Azure Reference Architecture Diagram – LOGICAL ARCHITECTURE Given that the Azure pricing for storage contains a transaction element, is there any indication as to how many of these transactions will be consumed opening or saving an object as well as other common actions - obviously a simple object will use much less than a complex one. This is great Mark, excellent write up.Ran into a similar problem a couple of years ago on AWS with the mirror VIP, had a less sophisiticated solution with a custom business service on a target production/namespace listening for a keep alive socket the ELB to detect which Mirror Member was active.... re-used it for an auto-scaling group too for an indicator for availability we could put logic behind. Those links up there to the routines appears broke for me, would love to take a look at that magic.What's Azure's VPN for solution look like for site 2 site connections? The diagrams above maybe suggest this is possibly bolted to on-prem, but just curious if you had any comments to that with Azure.Did you provision a DNS Zone on a legible domain for internal communications? I abused a couple of *.info domains for this purpose and found that the hostnames enumerated from Cache were from the Instances and not very usable for interhost communication and broke things like Enterprise Manager, HS Endpoint Enumeration, etc.Does Azure have an Internet Gateway or a NAT solution to provide communication outbound from a single address (or fault tolerance) ? The diagram for Web Server Load Balancing looks like they work for both inbound and outbound just wondered if that was the case.Again, excellent resource, thanks for taking the time. Hi Matthew,Thank you for your question. Pricing is tricky and best discussed with your Microsoft representative. When looking at premium storage accounts, you only pay for the provisioned disk type not transactions, however there are caveats. For example if you need only 100GB of storage will be be charges for a P0 disk @ 128GB. A good Microsoft article to help explain the details can be found here.Regards,Mark B Hi Ron,There are many options available for may different deployment scenarios. Specifically for the multi-site VPN you can use the Azure VPN Gateway. Here is a diagram provided by Microsoft's documentation showing it. Here is the link as well to the multi-site VPN details.As for Internet gateways, yes they have that concept and the load balancers can be internal or external. You control access with network security groups and also using the Azure Traffic Manager and also using Azure DNS services. There are tons of options here and really up to you and what/how you want to control and manage the network. Here is a link to Azure's documentation about how to make a load balancer Internet facing.The link to the code for some reason wasn't marked as public in the github repository. I'll take care of that now.Regards,Mark B-
Article
Anton Umnikov · Feb 11, 2020

InterSystems IRIS Deployment Guide for AWS using CloudFormation template

Please note: following this guide, especially the prerequisites section, requires an intermediate to advanced level of knowledge of AWS. You'll need to create and manage S3 buckets, IAM roles for EC2 instances, VPCs and Subnets. You'll also need access to InterSystems binaries (usually downloaded via the WRC site) as well as an IRIS license key. Updated: Aug 12, 2020 (Anton Umnikov). Templates source code is available here: https://github.com/antonum/AWSIRISDeployment
Table of Contents: Introduction; Prerequisites and Requirements (Time; Product License and Binaries; AWS Account; IAM Entity for user; IAM Role for EC2; S3 Bucket; VPC and Subnets; EC2 Key Pair; Knowledge Requirements); Architecture (Multi-AZ Fault Tolerant Architecture Diagram (Preferred); Single Instance, Single AZ Architecture Diagram (Development and Testing)); Deployment; Security (Data in Private Subnets; Encrypting IRIS Data at Rest; Encrypting IRIS Data in Transit; Secure Access to IRIS Management Portal); Logging/Auditing/Monitoring; Sizing/Cost; Deployment Assets (Deployment Options; Deployment Assets (Recommended for Production); CloudFormation Template Input Parameters; Clean Up); Testing the Deployment (Health Checks; Failover Test); Backup and Recovery (Backup; Instance Failure; Availability-Zone Failure; Region Failure; RPO/RTO; Storage Capacity; Security Certificate Expiration); Routine Maintenance; Emergency Maintenance; Support (Troubleshooting; Contact InterSystems Support); Appendix (IAM Policy for EC2 Instance)
Introduction InterSystems provides the CloudFormation template for users to set up their own InterSystems IRIS® data platform according to InterSystems and AWS best practices. This guide details the steps to deploy the CloudFormation template. In this guide, we cover two types of deployments for the InterSystems IRIS CloudFormation template. The first is highly available, uses multiple availability zones (AZ), and is targeted at production workloads; the second is a single availability zone deployment for development and testing workloads. Prerequisites and Requirements In this section, we detail the prerequisites and requirements to run and operate our solution. Time The deployment itself takes about 4 minutes, but with prerequisites and testing it could take up to 2 hours. Product License and Binaries InterSystems IRIS binaries are available to InterSystems customers via https://wrc.intersystems.com/. Log in with your WRC credentials and follow the links to Actions -> SW Distributions -> InterSystems IRIS. This Deployment Guide is written for the Red Hat platform of InterSystems IRIS 2020.1 build 197. IRIS binaries file names are of the format ISCAgent-2020.1.0.215.0-lnxrhx64.tar.gz and IRISHealth-2020.1.0.217.1-lnxrhx64.tar.gz. InterSystems IRIS license key – you should be able to use your existing InterSystems IRIS license key (iris.key). You can also request an evaluation key via the InterSystems IRIS Evaluation Service: https://download.intersystems.com/download/register.csp. AWS Account You must have an AWS account set up. If you do not, visit: https://aws.amazon.com/getting-started/ IAM Entity for user Create an IAM user or role. Your IAM user should have a policy that allows AWS CloudFormation actions. Do not use your root account to deploy the CloudFormation template. 
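As a quick, hedged sanity check (the user name below is a hypothetical example), you can confirm which IAM identity the AWS CLI is currently using, and that it is not the root account, before deploying:

# Show the identity the CLI will deploy as; the Arn should reference an IAM user or assumed role, not ":root"
$ aws sts get-caller-identity --output json

# Optionally list the policies attached to that user to confirm CloudFormation actions are allowed
$ aws iam list-attached-user-policies --user-name my-cfn-deployer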
In addition to AWS CloudFormation actions, IAM users who create or delete stacks will also require additional permissions that depend on the stack template. This deployment requires permissions to all the services listed in the following section. Reference: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-iam-template.html. IAM Role for EC2 The CloudFormation template requires an IAM role that allows your EC2 instance to access S3 buckets and put logs into CloudWatch. See Appendix “IAM Policy for EC2 instance” for an example of the policy associated with such a role. Reference: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html. S3 Bucket Create an S3 bucket (referred to here as “my bucket”), then copy the IRIS binaries and iris.key into it:
BUCKET=<my bucket>
aws s3 mb s3://$BUCKET
aws s3 cp ISCAgent-2020.1.0.215.0-lnxrhx64.tar.gz s3://$BUCKET
aws s3 cp IRISHealth-2020.1.0.217.1-lnxrhx64.tar.gz s3://$BUCKET
aws s3 cp iris.key s3://$BUCKET
VPC and Subnets The template is designed to deploy IRIS into an existing VPC and Subnets. In regions where three or more Availability Zones are available, we recommend creating three private subnets across three different AZs. The Bastion Host should be located in any of the public subnets within the VPC. You can follow the AWS example to create a VPC and Subnets with the CloudFormation template: https://docs.aws.amazon.com/codebuild/latest/userguide/cloudformation-vpc-template.html. EC2 Key Pair To access the EC2 instances provisioned by this template, you will need at least one EC2 Key Pair. Refer to this guide for details: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html. Knowledge Requirements Knowledge of the following AWS services is required: Amazon Elastic Compute Cloud (Amazon EC2), Amazon Virtual Private Cloud (Amazon VPC), AWS CloudFormation, AWS Elastic Load Balancing, and AWS S3. Account limit increases will not be required for this deployment. More information on proper policy and permissions can be found here: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-iam-template.html. Note: Individuals possessing the AWS Associate certifications should have a sufficient depth of knowledge. Architecture In this section, we give architecture diagrams of two deployment possibilities, and talk about architecture design choices. Multi-AZ Fault Tolerant Architecture Diagram (Preferred) In this preferred option, mirrored IRIS instances are situated behind a load balancer in two availability zones to ensure high availability and fault tolerance. In regions with three or more availability zones, the Arbiter node is located in the third AZ. Database nodes are located in private subnets. The Bastion Host is in a public subnet within the same VPC.
The Network Load Balancer directs database traffic to the current primary IRIS node.
The Bastion Host allows secure access to the IRIS EC2 instances.
IRIS stores all customer data in encrypted EBS volumes; EBS is encrypted and uses the AWS Key Management Service (KMS) managed key.
For regulated workloads where encryption of data in transit is required, you can choose to use the r5n family of instances, since they provide automatic instance-to-instance traffic encryption. 
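Returning to the prerequisites above, a minimal scripted sketch for creating the EC2 key pair and verifying the uploaded kits might look like the following; the key pair name is illustrative, and $BUCKET is the bucket created in the S3 step:

# Create an EC2 key pair and keep the private key locally (name is an example)
$ aws ec2 create-key-pair --key-name my-iris-key --query 'KeyMaterial' --output text > my-iris-key.pem
$ chmod 400 my-iris-key.pem

# Confirm the IRIS kits and license key are present in the bucket
$ aws s3 ls s3://$BUCKET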
IRIS-level traffic encryption is also possible but not enabled by CloudFormation (see the Encrypting Data in Transit section of this guide). The use of security groups restricts access to the greatest degree possible by only allowing necessary traffic. Single Instance, Single AZ Architecture Diagram (Development and Testing) InterSystems IRIS can also be deployed in a single Availability Zone for development and evaluation purposes. The data flow and architecture components are the same as the ones highlighted in the previous section. This solution does not provide high availability or fault tolerance, and is not suitable for production use. Deployment
1. Log into your AWS account with the IAM entity created in the Prerequisites section with the required permissions to deploy the solution.
2. Make sure all the prerequisites, such as VPC, S3 bucket, IRIS binaries and license key, are in place.
3. Click the following link to deploy the CloudFormation template (deploys in us-east-1): https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?stackName=InterSystemsIRIS&templateURL=https://isc-tech-validation.s3.amazonaws.com/MirrorCluster.yaml for multi-AZ, fault tolerant deployment.
4. In ‘Step 1 - Create Stack’, press the ‘Next’ button.
5. In ‘Step 2 - Specify stack details’, fill out and adjust CloudFormation parameters depending on your requirements, then press the ‘Next’ button.
6. In ‘Step 3 - Configure stack options’, enter and adjust optional tags, permissions, and advanced options, then press the ‘Next’ button.
7. Review your CloudFormation configurations and press the ‘Create Stack’ button.
8. Wait approximately 4 minutes for your CloudFormation template to deploy. You can verify your deployment has succeeded by looking for a ‘CREATE_COMPLETE’ status. If the status is ‘CREATE_FAILED’, see the troubleshooting section in this guide.
9. Once deployment succeeds, please carry out the Health Checks from this guide.
Security In this section, we discuss the InterSystems IRIS default configuration deployed by this guide, general best practices, and options for securing your solution on AWS. Data in Private Subnets InterSystems IRIS EC2 instances must be placed in private subnets and accessed only via the Bastion Host or by applications via the Load Balancer. Encrypting IRIS Data at Rest On database instances running InterSystems IRIS, data at rest is stored in underlying EBS volumes which are encrypted. This CloudFormation template creates EBS volumes encrypted with the account-default AWS managed key, named aws/ebs. Encrypting IRIS data in transit This CloudFormation template does not secure client-server or instance-to-instance connections. Should data-in-transit encryption be required, follow the steps outlined below after the deployment is completed. Enabling SSL for SuperServer connections (JDBC/ODBC connections): https://docs.intersystems.com/irislatest/csp/docbook/Doc.View.cls?KEY=GCAS_ssltls#GCAS_ssltls_superserver. In the durable multi-AZ configuration, traffic between IRIS EC2 instances may also need to be encrypted. This can be achieved either by enabling SSL encryption for mirroring: https://docs.intersystems.com/irislatest/csp/docbook/Doc.View.cls?KEY=GCAS_ssltls#GCAS_ssltls_mirroring or by switching to the r5n family of instances, which provides automatic encryption of instance-to-instance traffic. You can use AWS Certificate Manager (ACM) to easily provision, manage, and deploy Secure Sockets Layer/Transport Layer Security (SSL/TLS) certificates. 
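For teams that prefer scripting over the console walkthrough in the Deployment section above, a roughly equivalent stack launch is sketched below; the parameter keys shown are placeholders, so substitute the actual names listed in the CloudFormation Template Input Parameters section of the template you deploy:

# Launch the multi-AZ mirror stack from the published template (deploys in us-east-1)
$ aws cloudformation create-stack \
    --stack-name InterSystemsIRIS \
    --template-url https://isc-tech-validation.s3.amazonaws.com/MirrorCluster.yaml \
    --parameters ParameterKey=KeyName,ParameterValue=my-iris-key ParameterKey=BucketName,ParameterValue=$BUCKET \
    --capabilities CAPABILITY_IAM \
    --region us-east-1

# Wait for CREATE_COMPLETE (or inspect the status on failure)
$ aws cloudformation wait stack-create-complete --stack-name InterSystemsIRIS
$ aws cloudformation describe-stacks --stack-name InterSystemsIRIS --query 'Stacks[0].StackStatus'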
Secure access to IRIS Management Portal By default, the IRIS Management Portal is accessed only via the Bastion Host. Logging/Auditing/Monitoring InterSystems IRIS stores logging information in the messages.log file. CloudFormation does not set up any additional logging/monitoring services. We recommend that you enable structured logging as outlined here: https://docs.intersystems.com/irislatest/csp/docbook/Doc.View.cls?KEY=ALOG. The CloudFormation template does not install the InterSystems IRIS-CloudWatch integration. InterSystems recommends using the InterSystems IRIS-CloudWatch integration from https://github.com/antonum/CloudWatch-IRIS. This enables collection of IRIS metrics and logs from the messages.log file into AWS CloudWatch. The CloudFormation template does not enable AWS CloudTrail logs. You can enable CloudTrail logging by navigating to the CloudTrail service console and enabling CloudTrail logs. With CloudTrail, activity related to actions across your AWS infrastructure is recorded as events in CloudTrail. This helps you enable governance, compliance, and operational and risk auditing of your AWS account. Reference: https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-user-guide.html InterSystems recommends monitoring InterSystems IRIS logs and metrics, and alerting on at least the following indicators: severity 2 and 3 messages; license consumption; disk % full for journals and databases; Write Daemon status; Lock Table status. In addition to the above, customers are encouraged to identify their own monitoring and alert metrics and application-specific KPIs. Sizing/Cost This guide will create the AWS resources outlined in the Deployment Assets section of this document. You are responsible for the cost of AWS services used while running this deployment. The minimum viable configuration for an InterSystems IRIS deployment provides high availability and security. The template in this guide uses the BYOL (Bring Your Own License) InterSystems IRIS licensing model. You can access Pay Per Hour IRIS pricing at the InterSystems IRIS Marketplace page: https://aws.amazon.com/marketplace/pp/B07XRX7G6B?qid=1580742435148&sr=0-3 For details on BYOL pricing, please contact InterSystems at: https://www.intersystems.com/who-we-are/contact-us/. The following AWS assets are required to provide a functional platform: 3 EC2 instances (including EBS volumes and provisioned IOPS) and 1 Elastic Load Balancer. The following table outlines recommendations for EC2 and EBS capacity built into the deployment CloudFormation template, as well as AWS resource costs (units: $/month).

| Workload | Dev/Test | Prod Small | Prod Medium | Prod Large |
| --- | --- | --- | --- | --- |
| EC2 DB* | m5.large | 2 * r5.large | 2 * r5.4xlarge | 2 * r5.8xlarge |
| EC2 Arbiter* | t3.small | t3.small | t3.small | t3.small |
| EC2 Bastion* | t3.small | t3.small | t3.small | t3.small |
| EBS SYS | gp2 20GB | gp2 50GB | io1 512GB 1,000 iops | io1 600GB 2,000 iops |
| EBS DB | gp2 128GB | gp2 128GB | io1 1TB 10,000 iops | io1 4TB 10,000 iops |
| EBS JRN | gp2 64GB | gp2 64GB | io1 256GB 1,000 iops | io1 512GB 2,000 iops |
| Cost Compute | 85.51 | 199.71 | 1506.18 | 2981.90 |
| Cost EBS vol | 27.20 | 27.20 | 450.00 | 1286.00 |
| Cost EBS IOPS | - | - | 1560.00 | 1820.00 |
| Support (Basic) | - | - | 351.62 | 608.79 |
| Cost Total | 127.94 | 271.34 | 3867.80 | 6696.69 |
| Calculator link | Calculator | Calculator | Calculator | Calculator |

*All EC2 instances include an additional 20GB gp2 root EBS volume. AWS cost estimates are based on On-Demand pricing in the North Virginia Region. Cost of snapshots and data transfer are not included. Please consult AWS Pricing for the latest information. 
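As one hedged starting point for the alerting recommendations in the Logging/Auditing/Monitoring section above, the sketch below raises a CloudWatch alarm when an EC2 status check fails on one of the IRIS nodes; the instance ID and SNS topic ARN are placeholders, and the IRIS-level indicators (severity 2 and 3 messages, license use, journal and database space, Write Daemon status) would still come from the IRIS-CloudWatch integration referenced earlier:

# Alarm if the status check for an IRIS node fails for two consecutive minutes
$ aws cloudwatch put-metric-alarm \
    --alarm-name iris-node01-status-check \
    --namespace AWS/EC2 --metric-name StatusCheckFailed \
    --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
    --statistic Maximum --period 60 --evaluation-periods 2 \
    --threshold 1 --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions arn:aws:sns:us-east-1:111111111111:ops-alerts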
Deployment Assets Deployment Options The InterSystems IRIS CloudFormation template provides two different deployment options. The multi-AZ deployment option provides a highly available redundant architecture that is suitable for production workloads. The single-AZ deployment option provides a lower cost alternative that is suitable for development or test workloads. Deployment Assets (Recommended for Production) The InterSystems IRIS deployment is executed via a CloudFormation template that receives input parameters and passes them to the appropriate nested template. These are executed in order based on conditions and dependencies. AWS Resources Created: VPC Security Groups EC2 Instances for IRIS nodes and Arbiter Amazon Elastic Load Balancing (Amazon ELB) Network Load Balancer (NLB) CloudFormation Template Input Parameters General AWS EC2 Key Name Pair EC2 Instance Role S3 Name of S3 bucket where the IRIS distribution file and license key are located Network The individual VPC and Subnets where resources will be launched Database Database Master Password EC2 instance type for Database nodes Stack Creation There are four outputs for the master template: the JDBC endpoint that can be used to connect JDBC clients to InterSystems IRIS, the public IP of the Bastion Host and private IP addresses for both IRIS nodes. Clean Up Follow the AWS CloudFormation Delete documentation to delete the resources deployed by this document Delete any other resources that you manually created to integrate or assist with the deployment, such as S3 bucket and VPC Testing the Deployment Health Checks Follow the template output links to Node 01/02 Management Portal. Login with the username: SuperUser and the password you selected in the CloudFormation template. Navigate to System Administration -> Configuration -> Mirror Settings -> Edit Mirror. Make sure the system is configured with two Failover members. Verify that the mirrored database is created and active. System Administration -> Configuration -> Local Databases. Validate the JDBC connection by following the “First Look JDBC” document: https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=AFL_jdbc to validate JDBC connectivity to IRIS via the Load Balancer. Make sure to change the url variable to the value displayed in the template output, and password from “SYS” to the one you selected during setup. Failover Test On the Node02, navigate to the Management Portal (see “Health Check” section above) and open the Configuration->Edit Mirror page. At the bottom of the page you will see This member is the backup. Changes must be made on the primary. Locate the Node01 instance in the AWS EC2 management dashboard. Its name will be of the format: MyStackName-Node01-1NGXXXXXX Restart the Node01 instance. This will simulate an instance/AZ outage. Reload Node02 “Edit Mirror” page. The status should change to: This member is the primary. Changes will be sent to other members. Backup and Recovery Backup CloudFormation deployment does not enable backups for InterSystems IRIS. We recommend backing up IRIS EBS volumes using EBS Snapshot - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSSnapshots.html - in combination with IRIS Write Daemon Freeze: https://docs.intersystems.com/irislatest/csp/docbook/Doc.View.cls?KEY=GCDI_backup#GCDI_backup_methods_ext. Instance Failure Unhealthy IRIS instances are detected by IRIS mirroring and Load Balancer, and traffic is redirected to another mirror node. 
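As a hedged sketch of the backup approach recommended above (EBS snapshot combined with IRIS Write Daemon Freeze), the following assumes an instance named iris and a placeholder volume ID; adapt both to your deployment and validate the freeze/thaw calls against the InterSystems documentation for your IRIS version:

# Freeze IRIS writes before snapshotting the database volume
$ iris session iris -U %SYS "##Class(Backup.General).ExternalFreeze()"

# Snapshot the EBS volume holding the IRIS databases (volume ID is an example)
$ aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "IRIS DB volume snapshot"

# Thaw IRIS as soon as the snapshot has been initiated
$ iris session iris -U %SYS "##Class(Backup.General).ExternalThaw()"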
Instances that are capable of recovery will rejoin the mirror and continue normal operations. If you encounter persistently unhealthy instances, please see our Knowledge Base and the “Emergency Maintenance” section of this guide. Availability-Zone Failure In the event of an availability-zone failure, temporary traffic disruptions may occur. Similar to instance failure, IRIS mirroring and Load Balancer would handle the event by switching traffic to the IRIS instance in the remaining available AZ. Region Failure The architecture outlined in this guide does not deploy a configuration that supports multi-region operation. IRIS asynchronous mirroring and AWS Route53 can be used to build configurations capable of handling region failure with minimal disruption. Please refer to https://community.intersystems.com/post/intersystems-iris-example-reference-architectures-amazon-web-services-aws for details. RPO/RTO Recovery Point Objective (RPO) Single node Dev/Test configuration is defined by the time of the last successful backup. Multi Zone Fault Tolerant setup provides Active-Active configuration that ensures full data consistency in the event of failover, with RPO of the last successful transaction. Recovery Time Objective (RTO) Backup recovery for the Single node Dev/Test configuration is outside of the scope of this deployment guide. Please refer to https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-restoring-volume.html for details on restoring EBS volume snapshots. RTO for Multi Zone Fault Tolerant setup is typically defined by the time it takes for the Elastic Load Balancer to redirect traffic to the new Primary Mirror node of the IRIS cluster. You can further reduce RTO time by developing mirror-aware applications or adding an Application Server Connection to the mirror: https://docs.intersystems.com/irislatest/csp/docbook/Doc.View.cls?KEY=GHA_mirror#GHA_mirror_set_configecp. Storage Capacity IRIS Journal and Database EBS volumes can reach storage capacity. InterSystems recommends monitoring Journal and Database volume state using the IRIS Dashboard, as well as Linux file-system tools such as df. Both Journal and Database volumes can be expanded following the EBS guide https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-modify-volume.html. Note: both EBS volume expansion and Linux file system extension steps need to be performed. Optionally, after a database backup is performed, journal space can be reclaimed by running Purge Journals: https://docs.intersystems.com/irislatest/csp/docbook/Doc.View.cls?KEY=GCDI_journal#GCDI_journal_tasks. You can also consider enabling CloudWatch Agent on your instances to monitor disk space (not enabled by this CloudFormation template): https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html. Security certificate expiration You can use AWS Certificate Manager (ACM) to easily provision, deploy, manage, and monitor expiration of Secure Sockets Layer/Transport Layer Security (SSL/TLS) certificates. Certificates must be monitored for expiration. InterSystems does not provide an integrated process for monitoring certificate expiration. AWS provides a CloudFormation template that can help setup an alarm. Please visit the following link for details: https://docs.aws.amazon.com/config/latest/developerguide/acm-certificate-expiration-check.html. 
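As a sketch of the two-step expansion described in the Storage Capacity section above (volume ID, device name, and mount point are illustrative and depend on your instance and file system type):

# 1. Grow the EBS volume itself, e.g. expand the journal volume to 512 GB
$ aws ec2 modify-volume --volume-id vol-0123456789abcdef0 --size 512

# 2. On the instance, grow the partition (if the volume is partitioned) and then the file system
$ sudo growpart /dev/nvme1n1 1        # skip if the volume has no partition table
$ sudo xfs_growfs /iris/jrn           # XFS example; use resize2fs for ext4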
Routine Maintenance For IRIS upgrade procedures in mirrored configurations, please refer to: https://docs.intersystems.com/irislatest/csp/docbook/Doc.View.cls?KEY=GCI_upgrade#GCI_upgrade_tasks_mirrors. InterSystems recommends following the best practices of AWS and InterSystems for ongoing tasks, including: access key rotation; service limit evaluation; certificate renewals; IRIS license limits and expiration https://docs.intersystems.com/irislatest/csp/docbook/Doc.View.cls?KEY=GCM_dashboard; storage capacity monitoring https://docs.intersystems.com/irislatest/csp/docbook/Doc.View.cls?KEY=GCM_dashboard. Additionally, you might consider adding the CloudWatch Agent to your EC2 instances: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html. Emergency Maintenance If EC2 instances are available, connect to the instance via the bastion host. Note: The public IP of the bastion host may change after an instance stop/start. That does not affect availability of the IRIS cluster or the JDBC connection. For command line access, connect to the IRIS nodes via the bastion host:
$ chmod 400 <my-ec2-key>.pem
$ ssh-add <my-ec2-key>.pem
$ ssh -J ec2-user@<bastion-public-ip> ec2-user@<node-private-ip> -L 52773:<node-private-ip>:52773
After that, the Management Portal for the instance is available at: http://localhost:52773/csp/sys/%25CSP.Portal.Home.zen User: SuperUser, and the password you entered at stack creation. To connect to the IRIS command prompt use: $ iris session iris Consult the InterSystems IRIS Management and Monitoring guide: https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=GCM. Contact InterSystems Support. If EC2 instances are not available/reachable, contact AWS Support. NOTE: AZ or instance failures will automatically be handled in our Multi-AZ deployment. Support Troubleshooting I cannot “Create stack” in CloudFormation Please check that you have the appropriate permissions to “Create Stack”. Contact your AWS account admin for permissions, or AWS Support if you continue to encounter this issue. Stack is being created, but I can’t access IRIS It takes approximately 2 minutes from the moment the EC2 instance status turns into ‘CREATE_COMPLETE’ to the moment IRIS is fully available. SSH to the EC2 Node instances and check if IRIS is running: $ iris list If you don’t see any active IRIS instances, or the message “iris: command not found” appears, then the IRIS installation has failed. Run $ cat /var/log/cloud-init-output.log on the instance to identify any problems with the IRIS installation during instance first start. IRIS is up, but I can’t access either the Management Portal or connect from my [Java] application Make sure that the Security Group created by CloudFormation lists your source IP address as allowed. Contact InterSystems Support InterSystems Worldwide Response Center (WRC) provides expert technical assistance. InterSystems IRIS support is always included with your IRIS subscription. Phone, email and online support are always available to clients 24 hours a day, 7 days a week. We maintain support advisers in 15 countries around the world and have specialists fluent in English, Spanish, Portuguese, Italian, Welsh, Arabic, Hindi, Chinese, Thai, Swedish, Korean, Japanese, Finnish, Russian, French, German, Hebrew, and Hungarian. Every one of our clients immediately gets help from a highly qualified support specialist who really cares about client success. 
For Immediate Help Support phone: +1-617-621-0700 (US), +44 (0) 844 854 2917 (UK), 0800 615 658 (NZ Toll Free), 1800 628 181 (Aus Toll Free). Support email: support@intersystems.com Support online: WRC Direct Contact support@intersystems.com for a login. Appendix IAM Policy for EC2 instance The following IAM policy allows the EC2 instance to read objects from the S3 bucket ‘my-bucket’, and write logs to CloudWatch:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "S3BucketReadOnly",
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::my-bucket/*"
    },
    {
      "Sid": "CloudWatchWriteLogs",
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogStreams"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}
Hi @Anton.Umnikov excellent work on this (and a lot of it too). I was wondering if you can check the stack into an InterSystems GitHub repo so I can suggest some changes and additions to the CF Template through a PR? If not I can create one out of band too, but thought it would be nice, since it's available, to have it hosted in CC.
Article
Ben Spead · Dec 20, 2023

Leveraging your InterSystems Login Account to Up your Technical Game

You may not realize it, but your InterSystems Login Account can be used to access a very wide array of InterSystems services to help you learn and use InterSystems IRIS and other InterSystems technologies more effectively. Continue reading to learn more about how to unlock new technical knowledge and tools using your InterSystems Login account. Also - after reading, please participate in the Poll at the bottom, so we can see how this article was useful to you! What is an InterSystems Login Account? An InterSystems Login account is used to access various online services which serve InterSystems prospects, partners, and customers. It is a single set of credentials used across 15+ externally facing applications. Some applications (like the WRC or iService) require specific activation before access is granted to the account. Chances are there are resources that will help you but that you didn't know about - make sure to read about all of the options and try out a new tool to help up your technical game! Application Catalog You can view all services available to you with your InterSystems Login account by visiting the InterSystems Application Catalog, located at: https://Login.InterSystems.com. This will list only those applications or services to which you currently have access. It remembers your most frequently used applications and lists them at the top for your convenience. Make sure to bookmark the page for easy access to all of these tools in your InterSystems Login Account toolbox! Application Details Now it's time to get into the details of the individual applications and how they can help you as a developer working with InterSystems technologies! Read on and try to find a new application to leverage for the first time in order to improve your efficiency and skills as a developer.... Getting Started - gettingstarted.intersystems.com Audience Anyone wishing to explore using InterSystems IRIS® data platform Description Learn how to build data-intensive, mission-critical applications fast with InterSystems IRIS. Work through videos and tutorials leveraging SQL, Java, C#/.Net, Node.js, Python, or InterSystems ObjectScript. Use a free, cloud-based, in-browser Sandbox (IRIS + IDE + Web Terminal) to work through tutorials. How it helps Up Your Technical Game Quickly get oriented with InterSystems technology and see it in action with real working code and examples! Explore the use of other popular programming languages with InterSystems IRIS. Online Learning - learning.intersystems.com Audience All users and potential users of InterSystems products Description Self-paced materials to help you build and support the world's most important applications: Hands-on exercises Videos Online Courses Learning Paths How it helps Up Your Technical Game Learn, learn, learn!! Nothing will help you become a more effective developer faster than following a skilled technical trainer as they walk you through new concepts to use in your InterSystems IRIS projects! Documentation - docs.intersystems.com Audience All users and potential users of InterSystems products Description Documentation for all versions of our products Links where needed to external documentation All recent content is fed through our new search engine. The search page lets you filter by product, version, and other facets. 
Certain docs require authorization (via InterSystems Login account): AtScale docs available to Adaptive Analytics customers HealthShare docs are available to HealthShare users Make sure to make use of the new dynamic Upgrade Impact Checklist within the Docs server! How it helps Up Your Technical Game Quickly make use of class reference material and API documentation. Find example code. Read detailed usage documentation for parts of InterSystems IRIS into which you need a deeper dive. Request additional detail or report issues direct from within the documentation pages via the "Feedback" feature. Evaluation - evaluation.intersystems.com Audience Those wishing to download InterSystems software or licenses for evaluation or development use Description Downloads of InterSystems IRIS and InterSystems IRIS for Health. Anybody can download Community Edition kits. Existing customers can also request a powerful license to evaluate enterprise features. Preview versions are available pre-release. Early Access Program packages allow people to provide feedback on future products and features. How it helps Up Your Technical Game Try out Preview versions of software to see how new features can help to accelerate your development. Test run Enterprise features by requesting an evaluation license. Make sure all developers in your organization have the latest version of InterSystems IRIS installed on their machines. Provide feedback to InterSystems Product Management about Early Access Features to ensure that they will meet your team's needs once they are fully released. Developer Community - community.intersystems.com Audience Anyone working with InterSystems technology (InterSystems employees, customers, partners, and prospects) Description Monitor announcements related to InterSystems products and services. Find articles on a variety of technical topics. Ask questions and get answers from the community. Explore job postings or developers available for hire. Participate in competitions featuring $1000’s in cash prizes. Stay up to date concerning all things InterSystems! How it helps Up Your Technical Game With access to the leading global experts on InterSystems technology, you can learn from the best and stay engaged with the hottest questions, trends and topics. Automatically get updates in your inbox on new products, releases, and Early Access Program opportunities. Get help from peers to answer your questions and move past blockers. Have enriching discussions with InterSystems Product Managers and Product Developers - learn from the source! Push your skills to the next level by sharing technical solutions and sharing code and gaining from feedback from others. InterSystems Ideas - ideas.intersystems.com Audience Those looking to share ideas for improving InterSystems technology. Description Post ideas on how to make InterSystems technology better. Read existing reviews and up-vote or engage in discussions. InterSystems will take the most popular ideas into account for future product roadmaps. How it helps Up Your Technical Game See your ideas and needs turned into a reality within InterSystems products or open source libraries. Become familiar with the ideas of your peers and learn to use InterSystems products in new ways. Implement ideas suggested by others, new exploring parts of InterSystems technology. 
Global Masters - globalmasters.intersystems.com Audience Those wishing to advocate for InterSystems technology and earn badges and swag Description Gamification platform designed for developers to learn, stay up-to-date and get recognition for contributions via interactive content. Users receive points and badges for: Engagement on the Developer Community Engagement on the Open Exchange Publishing posts to social media about InterSystems products and technologies Trade in points for InterSystems swag or free training How it helps Up Your Technical Game Challenges bring to your attention articles or videos which you may have missed on the Developer Community, Learning site or YouTube channel - constantly learning new things to apply to your projects! Open Exchange - openexchange.intersystems.com Audience Developers seeking to publish or make use of reusable software packages and tools Description Developer tools and packages built with InterSystems data platforms and products. Packages are published under a variety of software licenses (mostly open source). Integrated with GitHub for package versioning, discussions, and bug tracking. Read and submit reviews and find the most popular packages. Developers can submit issues and make improvements to packages via GitHub pull requests to help push community software forward. Developers can see statistics of traffic and downloads of the packages they published How it helps Up Your Technical Game Don't reinvent the wheel! Use open source packages created and maintained by the InterSystems Community to solve generic problems, leaving you to focus on developing solutions needed specifically by your business. Contributing to open source packages is a great way to receive constructive feedback on your work and refine your development patterns. Becoming a respected contributor to open source projects is a great way to see demand increase for your skills and insights. WRC - wrc.intersystems.com Audience Issue tracking system for all customer reported problems on InterSystems IRIS and InterSystems HealthShare. Customers with SUTA can work directly with the application. Description Worldwide Response Center application (aka “WRC Direct”). Issue tracking system for all customer reported problems. Open new requests. See all investigative actions and add information and comments about a request. See statistical information about your support call history. Close requests and provide feedback about the support process. Review ad-hoc patch files. Monitor software change requests. Download current product and client software releases. How it helps Up Your Technical Game InterSystems Support Engineers can help you get past any technical blocker you have concerning development or systems management with InterSystems products. Report bugs to ensure that issues are fixed in future releases. iService - iservice.intersystems.com Audience Customers requiring support under an SLA agreement Description A support ticketing platform for our healthcare, cloud and hosted customers. Allows for rule driven service-level agreement (SLA) compliance calculation and reporting. Provides advanced facet search and export functionality. Incorporates a full Clinical Safety management system. How it helps Up Your Technical Game InterSystems Support Engineers can help you get past any technical blocker you have concerning development or systems management with InterSystems healthcare or cloud products. Report bugs to ensure that issues are fixed in future releases. 
ICR - containers.intersystems.com Audience Anyone who wants to use InterSystems containers Description InterSystems Container Registry A programmatically accessible container registry and web UI for browsing. Community Edition containers available to everyone. Commercial versions of InterSystems IRIS and InterSystems IRIS for Health available for supported customers. Generate tokens to use in CICD pipelines for automatically fetching containers. How it helps Up Your Technical Game Increase the maturity of your SDLC by moving to container-based CICD pipelines for your development, testing and deployment! Partner Directory - partner.intersystems.com Audience Those looking to find an InterSystems partner or partner’s product Partners looking to advertise their software and services Description Search for all types of InterSystems partners: Implementation Partners Solution Partners Technology Partners Cloud Partner Existing partners can manage their service and software listings. How it helps Up Your Technical Game Bring in certified experts on a contract basis to learn from them on your projects. License enterprise solutions based on InterSystems technology so you don't have to build everything from scratch. Bring your products and services to a wider audience, increasing demand and requiring you to increase your ability to deliver! CCR - ccr.intersystems.com Audience Select organizations managing changes made to an InterSystems implementation (employees, partners and end users) Description Change Control Record Custom workflow application built on our own technology to track all customizations to InterSystems healthcare products installed around the world. Versioning and deployment of onsite custom code and configuration changes. Multiple Tiers and workflow configuration options. Very adaptable to meet the specific needs of the phase of the project How it helps Up Your Technical Game For teams authorized for its use, find and reuse code or implementation plans within your organization, preventing having to solve the same problem multiple times. Resolve issues in production much more quickly, leaving more time for development work. Client Connection - client.intersystems.com Audience Available to any TrakCare clients Description InterSystems Client Connection is a collaboration and knowledge-sharing platform for TrakCare clients. Online community for TrakCare clients to build more, better, closer connections. On Client Connection you will find the following: TrakCare news and events TrakCare release materials, e.g. release documentation and preview videos Access to the most up-to-date product guides. Support materials to grow personal knowledge. Discussion forums to leverage peer expertise. How it helps Up Your Technical Game Technical and Application Specialists at TrakCare sites can share questions and knowledge quickly - connecting with other users worldwide. Faster answers means more time to build solutions! Online Ordering - store.intersystems.com Audience Operations users at selected Application partners/end-users Description Allow customers to pick different products according to their contracts and create new orders. Allow customers to upgrade/trade-in existing orders. Submit orders to InterSystems Customer Operations to process them for delivery and invoicing. Allow customers to migrate existing licenses to InterSystems IRIS. How it helps Up Your Technical Game Honestly, it doesn't! 
It's a tool used by operations personnel and not technical users, but it is listed here for completeness since access is controlled via the InterSystems Login Account ;) Other Things to Know About your InterSystems Login Account Here are a few more useful facts about InterSystems Login Accounts... How to Create a Login Account Users can make their own account by clicking "Create Account" on any InterSystems public-facing application, including: https://evaluation.intersystems.com https://community.intersystems.com https://learning.intersystems.com Alternatively, the InterSystems FRC (First Response Center) will create a Login Account for supported customers the first time they need to access the Worldwide Response Center (WRC) or iService (or supported customers can also create accounts for their colleagues). Before using an account, a user must accept the Terms and Conditions, either during the self-registration process or the first time they log in. Alternative Login Options Certain applications allow login with Google or GitHub: Developer Community Open Exchange Global Masters This is the same InterSystems Login Account, but with authentication by Google or GitHub. Account Profile If you go to https://Login.InterSystems.com and authenticate, you will be able to access Options > Profile and make basic changes to your account. Email can be changed via Options > Change Email. Resolving Login Account Issues Issues with InterSystems Login Accounts should be directed to Support@InterSystems.com. Please include: Username used for attempted login Email Browser type and version Specific error messages and/or screenshots Time and date the error was received Please remember to vote in the poll once you read the article! Feel free to ask questions here about apps that may be new to you. Best part for me is having one spot instead of trying to remember the links to all the pieces. Thanks @Mindy.Caldwell - that is the goal! Glad you find it to be useful :) 💡 This article is considered as InterSystems Data Platform Best Practice.
Announcement
Evgeny Shvarov · May 12

Technology Bonuses for the InterSystems FHIR and Digital Health Interoperability Contest 2025

Hi Developers! Here are the technology bonuses for the InterSystems FHIR and Digital Health Interoperability Contest 2025 that will give you extra points in the voting: InterSystems FHIR usage - 3 Digital Health Interoperability - 4 Vector Search - 3 LLM AI or LangChain usage: Chat GPT, Gemini and others - 3 Embedded Python - 2 Docker container usage - 2 IPM Package deployment - 2 Online Demo - 2 Implement InterSystems Community Idea - 4 Find a bug in InterSystems FHIR server - 2 Find a bug in InterSystems Interoperability - 2 New First Article on Developer Community - 2 New Second Article on Developer Community - 1 First Time Contribution - 3 Video on YouTube - 3 See the details below. InterSystems FHIR usage - 3 points Implement InterSystems FHIR server in your application either as a standalone cloud FHIR server or as a component of InterSystems IRIS for Health and collect 3 bonus points! Digital Health Interoperability - 4 points Collect 4 bonus points if your application is a healthcare interoperability solution that uses InterSystems Interoperability to transfer and/or transform healthcare data via messages, or if it uses healthcare format data transformation. Here are a couple of examples: one, two, three. Vector Search - 3 points Starting from the 2024.1 release, InterSystems IRIS contains a new technology, vector search, that allows building vectors over InterSystems IRIS data and performing a search of already indexed vectors. Use it in your solution and collect 3 bonus points. Here is the demo project that leverages it. LLM AI or LangChain usage: Chat GPT, Bard, and others - 3 points Collect 3 bonus expert points for building a solution that uses LangChain libs or Large Language Models (LLM) such as ChatGPT, Bard and other AI engines like PaLM, LLaMA, and more. AutoGPT usage counts too. A few examples can already be found in Open Exchange: iris-openai, chatGPT telegram bot, rag-demo. Here is an article with a langchain usage example. Embedded Python - 2 points Use Embedded Python in your application and collect 2 extra points. Base template, example application with Interoperability. Docker container usage - 2 points The application gets a 'Docker container' bonus if it uses InterSystems IRIS running in a docker container. Here is the simplest template to start from. ZPM Package deployment - 2 points You can collect the bonus if you build and publish the ZPM (InterSystems Package Manager) package for your Full-Stack application so it can be deployed with the zpm "install your-multi-model-solution" command on IRIS with the ZPM client installed. ZPM client. Documentation. Online Demo of your project - 2 points Collect 2 more bonus points if you provision your project to the cloud as an online demo at any public hosting. Implement Community Opportunity Idea - 4 points Implement any idea from the InterSystems Community Ideas portal which has the "Community Opportunity" status. This will give you 4 additional bonus points. Find a bug in InterSystems Digital Health Interoperability - 2 points We want the broader adoption of the InterSystems Interoperability engine, so we encourage you to report the bugs you face during the development of your interoperability application with IRIS so they can be fixed. Please submit the bug here in the form of an issue and how to reproduce it. You can collect 2 bonus points for the first reproducible bug. 
Find a bug in InterSystems FHIR Server - 2 points We want the broader adoption of InterSystems FHIR, so we encourage you to report the bugs you face during the development of your FHIR application so they can be fixed. Please submit the bug here in the form of an issue and how to reproduce it. You can collect 2 bonus points for the first reproducible bug. New First Article on Developer Community - 2 points Write a brand new article on Developer Community that describes the features of your project and how to work with it. Collect 2 points for the article. New Second Article on Developer Community - 1 point You can collect one more bonus point for a second new article or a translation regarding the application. A third or further article will not bring more points, but the attention will all be yours. First-Time Contribution - 3 points Collect 3 bonus points if you participate in InterSystems Open Exchange contests for the first time! Video on YouTube - 3 points Make new YouTube videos that demonstrate your product in action and collect 3 bonus points for each. The list of bonuses is subject to change. Stay tuned! Good luck in the competition!
Article
Mark Bolinsky · Feb 12, 2019

InterSystems IRIS Example Reference Architectures for Amazon Web Services (AWS)

The Amazon Web Services (AWS) Cloud provides a broad set of infrastructure services, such as compute resources, storage options, and networking, that are delivered as a utility: on-demand, available in seconds, with pay-as-you-go pricing. New services can be provisioned quickly, without upfront capital expense. This allows enterprises, start-ups, small and medium-sized businesses, and customers in the public sector to access the building blocks they need to respond quickly to changing business requirements. Updated: 10-Jan, 2023 The following overview and details are provided by Amazon and can be found here. Overview AWS Global Infrastructure The AWS Cloud infrastructure is built around Regions and Availability Zones (AZs). A Region is a physical location in the world where we have multiple AZs. AZs consist of one or more discrete data centers, each with redundant power, networking, and connectivity, housed in separate facilities. These AZs offer you the ability to operate production applications and databases that are more highly available, fault tolerant, and scalable than would be possible from a single data center. Details of AWS Global Infrastructure can be found here. AWS Security and Compliance Security in the cloud is much like security in your on-premises data centers, only without the costs of maintaining facilities and hardware. In the cloud, you don’t have to manage physical servers or storage devices. Instead, you use software-based security tools to monitor and protect the flow of information into and out of your cloud resources. The AWS Cloud enables a shared responsibility model. While AWS manages security of the cloud, you are responsible for security in the cloud. This means that you retain control of the security you choose to implement to protect your own content, platform, applications, systems, and networks, no differently than you would in an on-site data center. Details of AWS Cloud Security can be found here. The IT infrastructure that AWS provides to its customers is designed and managed in alignment with best security practices and a variety of IT security standards. A complete list of assurance programs with which AWS complies can be found here. AWS Cloud Platform AWS consists of many cloud services that you can use in combinations tailored to your business or organizational needs. The following sub-section introduces the major AWS services by category that are commonly used with InterSystems IRIS deployments. There are many other services available and potentially useful for your specific application. Be sure to research those as needed. To access the services, you can use the AWS Management Console, the Command Line Interface, or Software Development Kits (SDKs). AWS Cloud Platform Component Details AWS Management Console Details of the AWS Management Console can be found here. AWS Command-line interface Details of the AWS Command Line Interface (CLI) can be found here. AWS Software Development Kits (SDK) Details of AWS Software Development Kits (SDK) can be found here. 
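For example, a minimal sketch of preparing the AWS CLI before working through the rest of this article (region and output format are whatever suits your environment):

# Store credentials, default region, and output format for the CLI
$ aws configure

# Verify access by listing the regions visible to your account
$ aws ec2 describe-regions --output table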
AWS Compute There are numerous options available: Details of Amazon Elastic Compute Cloud (EC2) can be found here Details of Amazon EC2 Container Service (ECS) can be found here Details of Amazon EC2 Container Registry (ECR) can be found here Details of Amazon Auto Scaling can be found here AWS Storage There are numerous options available: Details of Amazon Elastic Block Store (EBS) can be found here Details of Amazon Simple Storage Service (S3) can be found here Details of Amazon Elastic File System (EFS) can be found here AWS Networking There are numerous options available. Details of Amazon Virtual Private Cloud (VPC) can be found here Details of Amazon Elastic IP Addresses can be found here Details of Amazon Elastic Network Interfaces can be found here Details of Amazon Enhanced Networking for Linux can be found here Details of Amazon Elastic Load Balancing (ELB) can be found here Details of Amazon Route 53 can be found here InterSystems IRIS Sample Architectures As part of this article, sample InterSystems IRIS deployments for AWS are provided as a starting point for your application-specific deployment. These can be used as a guideline for numerous deployment possibilities. This reference architecture demonstrates highly robust deployment options, ranging from the smallest deployments to massively scalable workloads for both compute and data requirements. High availability and disaster recovery options are covered in this document along with other recommended system operations. It is expected these will be modified by the individual to support their organization’s standard practices and security policies. InterSystems is available for further discussions or questions about AWS-based InterSystems IRIS deployments for your specific application. Sample Reference Architectures The following sample architectures provide several different configurations with increasing capacity and capabilities. Consider these examples of small development / production / large production / production with sharded cluster, which show the progression from a modest configuration for development efforts to massively scalable solutions with proper high availability across zones and multi-region disaster recovery. In addition, an example architecture is provided that uses the new sharding capabilities of InterSystems IRIS Data Platform for hybrid workloads with massively parallel SQL query processing. Small Development Configuration In this example, a minimal configuration is used to illustrate a small development environment capable of supporting up to 10 developers and 100GB of data. More developers and stored data can easily be supported by simply changing the virtual machine instance type and increasing storage of the EBS volume(s) as appropriate. This is adequate to support development efforts and become familiar with InterSystems IRIS functionality along with Docker container building and orchestration if desired. High availability with database mirroring is typically not used with a small configuration; however, it can be added at any time if high availability is needed. Small Configuration Sample Diagram The below sample diagram in Figure 2.1.1-a illustrates the table of resources in Figure 2.1.1-b. The gateways included are just examples, and can be adjusted accordingly to suit your organization’s standard network practices. Figure-2.1.1-a: Sample Small Development Architecture The following resources within the AWS VPC are provisioned as a minimum small configuration. 
AWS resources can be added or removed as required. Small Configuration AWS Resources A sample of Small Configuration AWS resources is provided in the following table. Proper network security and firewall rules need to be considered to prevent unwanted access into the VPC. Amazon provides network security best practices for getting started which can be found here: https://docs.aws.amazon.com/vpc/index.html#lang/en_us https://docs.aws.amazon.com/quickstart/latest/vpc/architecture.html#best-practices Note: VM instances require a public IP address to reach AWS services. While this practice might raise some concerns, AWS recommends limiting the incoming traffic to these VM instances by using firewall rules. If your security policy requires truly internal VM instances, you will need to set up a NAT proxy manually on your network and a corresponding route so that the internal instances can reach the Internet. It is important to note that you cannot connect to a fully internal VM instance directly by using SSH. To connect to such internal machines, you must set up a bastion instance that has an external IP address and then tunnel through it. A bastion host can be provisioned to provide the external-facing point of entry into your VPC. Details of using a bastion host can be found here: https://aws.amazon.com/blogs/security/controlling-network-access-to-ec2-instances-using-a-bastion-server/ https://docs.aws.amazon.com/quickstart/latest/linux-bastion/architecture.html Production Configuration In this example, a more sizable configuration is provided as an example production configuration that incorporates InterSystems IRIS database mirroring capability to support high availability and disaster recovery. Included in this configuration is a synchronous mirror pair of InterSystems IRIS database servers split between two availability zones within region-1 for automatic failover, and a third DR asynchronous mirror member in region-2 for disaster recovery in the unlikely event an entire AWS region is offline. Details of multi-region, multi-VPC connectivity can be found here. The InterSystems Arbiter and ICM server are deployed in a separate third zone for added resiliency. The sample architecture also includes a set of optional load balanced web servers to support a web-enabled application. These web servers with the InterSystems Gateway can be scaled independently as needed. 
Multiple AWS availability zones are used with both ECP-based application servers and database mirror members deployed in multiple regions. This configuration is capable of supporting tens of millions of database accesses per second and multiple terabytes of data. Production Configuration Sample Diagram The sample diagram in Figure 2.3.1-a illustrates the table of resources in Figure 2.3.1-b. The gateways included are just examples, and can be adjusted accordingly to suit your organization’s standard network practices. Included in this configuration is a failover mirror pair, four or more ECP clients (application servers), and one or more web servers per application server. The failover database mirror pairs are split between two different AWS availability zones in the same region for fault domain protection, with the InterSystems Arbiter and ICM server deployed in a separate third zone for added resiliency. Disaster recovery extends to a second AWS region and availability zone(s) similar to the earlier example. Multiple DR regions can be used with multiple DR Async mirror member targets if desired. Figure 2.3.1-a: Sample Large Production Architecture with ECP Application Servers The following resources within the AWS VPC are recommended as a minimum to support this large production configuration. AWS resources can be added or removed as required. Large Production Configuration AWS Resources A sample of Large Production Configuration AWS resources is provided in the following table. Production Configuration with InterSystems IRIS Sharded Cluster In this example, a horizontally scaled configuration for hybrid workloads with SQL is provided by including the new sharded cluster capabilities of InterSystems IRIS to provide massive horizontal scaling of SQL queries and tables across multiple systems. Details of the InterSystems IRIS sharded cluster and its capabilities are discussed further in section 9 of this article. Production with Sharded Cluster Configuration Sample Diagram The sample diagram in Figure 2.4.1-a illustrates the table of resources in Figure 2.4.1-b. The gateways included are just examples, and can be adjusted accordingly to suit your organization’s standard network practices. Included in this configuration are four mirror pairs as the data nodes. Each of the failover database mirror pairs is split between two different AWS availability zones in the same region for fault domain protection, with the InterSystems Arbiter and ICM server deployed in a separate third zone for added resiliency. This configuration allows for all the database access methods to be available from any data node in the cluster. The large SQL table(s) data is physically partitioned across all data nodes to allow for massive parallelization of both query processing and data volume. Combining all these capabilities provides the ability to support complex hybrid workloads such as large-scale analytical SQL querying with concurrent ingestion of new data, all within a single InterSystems IRIS Data Platform. Figure 2.4.1-a: Sample Production Configuration with Sharded Cluster with High Availability Note that in the above diagram and the “resource type” column in the table below, the term “EC2” is an AWS term representing an AWS virtual server instance as described further in section 3.1 of this document. It does not represent or imply the use of “compute nodes” in the cluster architecture described in chapter 9. 
The following resources within the AWS VPC are recommended as a minimum to support a sharded cluster. AWS resources can be added or removed as required.

Production with Sharded Cluster Configuration AWS Resources

A sample of the Production with Sharded Cluster Configuration AWS resources is provided in the following table.

Introduction to Cloud Concepts

Amazon Web Services (AWS) provides a feature rich cloud environment for Infrastructure-as-a-Service (IaaS) fully capable of supporting all InterSystems products, including support for container-based DevOps with the new InterSystems IRIS Data Platform. Care must be taken, as with any platform or deployment model, to ensure all aspects of an environment are considered, such as performance, availability, system operations, high availability, disaster recovery, security controls, and other management procedures. This article will cover the three major components of all cloud deployments: Compute, Storage, and Networking.

Compute Engines (Virtual Machines)

Within AWS EC2 there are several options available for compute engine resources, with numerous virtual CPU and memory specifications and associated storage options. One item to note within AWS EC2: the number of vCPUs quoted for a given machine type equates to hyper-threads, that is, one vCPU is one hyper-thread on the physical host at the hypervisor layer.

For the purposes of this document, m5* and r5* EC2 instance types will be used, as they are widely available in most AWS deployment regions. However, other specialized instance types are also good options, such as x1* instances with very large memory for keeping massive amounts of a very large working dataset cached in memory, or i3* instances with NVMe local instance storage.

Details of the AWS Service Level Agreement (SLA) can be found here.

Disk Storage

The storage types most directly related to InterSystems products are the persistent disk types; however, local storage may be used for high levels of performance as long as the data availability restrictions are understood and accommodated. There are several other options, such as S3 (buckets) and Elastic File System (EFS), however those are more specific to an individual application’s requirements than to supporting the operation of InterSystems IRIS Data Platform.

Like most other cloud providers, AWS imposes limitations on the amount of persistent storage that can be associated with an individual compute engine. These limits include the maximum size of each disk, the number of persistent disks attached to each compute engine, and the amount of IOPS per persistent disk, with an overall per-instance IOPS cap. In addition, there are imposed IOPS limits per GB of disk space, so at times provisioning more disk capacity is required to achieve the desired IOPS rate. These limits may change over time and should be confirmed with AWS as appropriate.

There are three persistent storage types for disk volumes: EBS gp2 (SSD), EBS st1 (HDD), and EBS io1 (provisioned-IOPS SSD). The SSD-based EBS volumes are better suited for production workloads that require predictable low-latency IOPS and higher throughput, while st1 (HDD) volumes are a more economical option for non-production development and test or archive-type workloads. Details of the various disk types and limitations can be found here.
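To illustrate the provisioned-IOPS point above, an io1 volume might be created with an explicit IOPS setting and attached to an instance along the following lines. The size, IOPS value, availability zone, device name, and IDs are placeholders and remain subject to the AWS limits referenced above.

```
# Create a 500 GiB provisioned-IOPS (io1) volume with 10,000 IOPS,
# then attach it to an existing EC2 instance. IDs are placeholders.
aws ec2 create-volume \
    --availability-zone us-east-1a \
    --volume-type io1 \
    --size 500 \
    --iops 10000

aws ec2 attach-volume \
    --volume-id vol-0123456789abcdef0 \
    --instance-id i-0123456789abcdef0 \
    --device /dev/sdh
```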
VPC Networking

A virtual private cloud (VPC) network is highly recommended to support the various components of InterSystems IRIS Data Platform, along with providing proper network security controls, various gateways, routing, internal IP address assignments, network interface isolation, and access controls. An example VPC will be detailed in the examples provided within this document. Details of VPC networking and firewalls can be found here.

Virtual Private Cloud (VPC) Overview

Details of AWS VPC are provided here.

In most large cloud deployments, multiple VPCs are provisioned to isolate the various gateway types from application-centric VPCs, and VPC peering is leveraged for inbound and outbound communications. It is highly recommended to consult with your network administrator for details on allowable subnets and any organizational firewall rules of your company. VPC peering is not covered in this document.

In the examples provided in this document, a single VPC with three subnets will be used to provide network isolation of the various components for predictable latency and bandwidth, and security isolation of the various InterSystems IRIS components.

Network Gateway and Subnet Definitions

Two gateways are provided in the example in this document to support both Internet and secure VPN connectivity. Each ingress access point is required to have appropriate firewall and routing rules to provide adequate security for the application. Details on how to use VPC route tables can be found here.

Three subnets are used in the provided example architectures, dedicated for use with InterSystems IRIS Data Platform. The use of these separate network subnets and network interfaces allows for flexibility in security controls and bandwidth protection and monitoring for each of the three major components. Details for creating virtual machine instances with multiple network interfaces can be found here.

The subnets included in these examples:

User Space Network for inbound connected users and queries
Shard Network for inter-shard communications between the shard nodes
Mirroring Network for high availability using synchronous replication and automatic failover of individual data nodes

Note: Failover synchronous database mirroring is only recommended between multiple zones with low-latency interconnects within a single AWS region. Latency between regions is typically too high to provide a positive user experience, especially for deployments with a high rate of updates.

Internal Load Balancers

Most IaaS cloud providers lack the ability to provide a Virtual IP (VIP) address that is typically used in automatic database failover designs. To address this, several of the most commonly used connectivity methods, specifically ECP clients and Web Gateways, have been enhanced within InterSystems IRIS to no longer rely on VIP capabilities, making them mirror-aware and automatic.

Connectivity methods such as xDBC, direct TCP/IP sockets, or other direct connect protocols require the use of a VIP-like address. To support those inbound protocols, InterSystems database mirroring makes it possible to provide automatic failover for those connectivity methods within AWS by using a health check status page called mirror_status.cxw to interact with the load balancer: the load balancer achieves VIP-like functionality by directing traffic only to the active primary mirror member, thus providing a complete and robust high availability design within AWS. Details of AWS Elastic Load Balancer (ELB) can be found here.
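As a sketch of how the mirror_status.cxw health check might be wired into an AWS load balancer, a target group can be created with its health check pointed at the status page served by the Web Gateway. The VPC ID, port, intervals, and the exact URL path are assumptions to be adjusted for your Web Gateway configuration.

```
# Create a target group whose health check polls the mirror_status.cxw page;
# only the current primary mirror member reports healthy and receives traffic.
# IDs, port, and path are placeholders.
aws elbv2 create-target-group \
    --name iris-mirror-tg \
    --protocol HTTP \
    --port 80 \
    --vpc-id vpc-0123456789abcdef0 \
    --health-check-protocol HTTP \
    --health-check-path /csp/bin/mirror_status.cxw \
    --health-check-interval-seconds 10 \
    --healthy-threshold-count 2 \
    --unhealthy-threshold-count 2
```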
Figure 4.2-a: Automatic Failover without a Virtual IP Address

Details of using a load balancer to provide VIP-like functionality are provided here.

Update 2023-01-10: There is a new recommended VIP model for AWS that is more robust and alleviates the need for a load balancer to provide VIP-like capabilities. Details can be found here.

Sample VPC Topology

Combining all the components together, the following illustration in Figure 4.3-a demonstrates the layout of a VPC with the following characteristics:

Leverages multiple zones within a region for high availability
Provides two regions for disaster recovery
Utilizes multiple subnets for network segregation
Includes separate gateways for VPC peering, Internet, and VPN connectivity
Uses a cloud load balancer for IP failover for mirror members

Please note that in AWS each subnet must reside entirely within one availability zone and cannot span zones, so in the example below network security and routing rules need to be properly defined. Details on AWS VPC subnets can be found here.

Figure 4.3-a: Example VPC Network Topology

Persistent Storage Overview

As discussed in the introduction, the use of AWS Elastic Block Store (EBS) volumes is recommended, specifically the EBS gp2 or the latest gp3 volume types. EBS gp3 volumes are recommended due to the higher read and write IOPS rates and low latency required for transactional and analytical database workloads. Local SSDs may be used in certain circumstances; however, be aware that the performance gains of local SSDs come with certain trade-offs in availability, durability, and flexibility. Details of Local SSD data persistence can be found here to understand when Local SSD data is preserved and when it is not.

LVM PE Striping

Like other cloud providers, AWS imposes numerous limits on storage, in IOPS, space capacity, and the number of devices per virtual machine instance. Consult AWS documentation for current limits, which can be found here.

With these limits, LVM striping becomes necessary to maximize IOPS beyond that of a single disk device for a database instance. In the example virtual machine instances provided, the following disk layouts are recommended. Performance limits associated with SSD persistent disks can be found here.

Note: There is currently a maximum of 40 EBS volumes per Linux EC2 instance, although AWS resource capabilities change often, so please consult AWS documentation for current limitations.

Figure 5.1-a: Example LVM Volume Group Allocation

The benefit of LVM striping is that it spreads random IO workloads across more disk devices and their inherent disk queues. Below is an example of how to use LVM striping with Linux for the database volume group. This example uses four disks in an LVM PE stripe with a physical extent (PE) size of 4MB. Alternatively, larger PE sizes can be used if needed.
Step 1: Create standard or SSD persistent disks as needed

Step 2: Verify the IO scheduler is NOOP for each of the disk devices using “lsblk -do NAME,SCHED”

Step 3: Identify disk devices using “lsblk -do KNAME,TYPE,SIZE,MODEL”

Step 4: Create the volume group with the new disk devices

vgcreate -s 4M <vg name> <list of all disks just created>
example: vgcreate -s 4M vg_iris_db /dev/sd[h-k]

Step 5: Create the logical volume

lvcreate -n <lv name> -L <size of LV> -i <number of disks in volume group> -I 4M <vg name>
example: lvcreate -n lv_irisdb01 -L 1000G -i 4 -I 4M vg_iris_db

Step 6: Create the file system

mkfs.xfs -K <logical volume device>
example: mkfs.xfs -K /dev/vg_iris_db/lv_irisdb01

Step 7: Mount the file system

Edit /etc/fstab with the following mount entry:

/dev/mapper/vg_iris_db-lv_irisdb01 /vol-iris/db xfs defaults 0 0

mount /vol-iris/db

Using the above table, each of the InterSystems IRIS servers will have the following configuration: two disks for SYS, four disks for DB, two disks for primary journals, and two disks for alternate journals.

Figure 5.1-b: InterSystems IRIS LVM Configuration

For growth, LVM allows devices and logical volumes to be expanded when needed without interruption. Consult the Linux documentation on best practices for ongoing management and expansion of LVM volumes.

Note: The enablement of asynchronous IO for both the database and the write image journal files is highly recommended. See the community article for details on enabling it on Linux.

Provisioning

New with InterSystems IRIS is InterSystems Cloud Manager (ICM). ICM carries out many tasks and offers many options for provisioning InterSystems IRIS Data Platform. ICM is provided as a Docker image that includes everything needed for provisioning a robust AWS cloud-based solution.

ICM currently supports provisioning on the following platforms:

Amazon Web Services including GovCloud (AWS / GovCloud)
Google Cloud Platform (GCP)
Microsoft Azure Resource Manager including Government (ARM / MAG)
VMware vSphere (ESXi)

ICM and Docker can run from either a desktop/laptop workstation or a modest centralized dedicated “provisioning” server with a centralized repository.

The role of ICM in the application lifecycle is Define -> Provision -> Deploy -> Manage (a minimal command-line sketch of this workflow appears after the monitoring notes below). Details for installing and using ICM with Docker can be found here.

NOTE: The use of ICM is not required for any cloud deployment. The traditional method of installation and deployment with tar-ball distributions is fully supported and available. However, ICM is recommended for ease of provisioning and management in cloud deployments.

Container Monitoring

ICM includes two basic monitoring facilities for container-based deployments: Rancher and Weave Scope. Neither is deployed by default; they need to be specified in the defaults file using the Monitor field. Details for monitoring, orchestration, and scheduling with ICM can be found here. An overview of Rancher and its documentation can be found here. An overview of Weave Scope and its documentation can be found here.
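As a rough sketch of the Define -> Provision -> Deploy -> Manage workflow referred to above, the ICM container is typically started with Docker and then driven with a handful of icm commands. The image name and tag are illustrative, and the configuration file contents and any required volume mounts should be taken from the ICM documentation for your version.

```
# Start the ICM container (image name/tag illustrative); in practice a working
# directory holding defaults.json, definitions.json, and state is mounted in.
docker run -it --rm intersystems/icm:latest

# Inside the ICM container, after editing defaults.json and definitions.json
# for an AWS deployment:
icm provision      # create the AWS infrastructure described in the config files
icm run            # deploy InterSystems IRIS containers onto the provisioned nodes
icm ps             # check the state of the deployed containers
icm unprovision    # tear the infrastructure back down when finished
```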
High Availability

InterSystems database mirroring provides the highest level of availability in any cloud environment. AWS does not provide any availability guarantees for a single EC2 instance, so database mirroring is required at the database tier, and it can also be coupled with load balancing and auto scaling groups. Earlier sections discussed how a cloud load balancer provides automatic IP address failover for Virtual IP (VIP-like) capability with database mirroring.

The cloud load balancer uses the mirror_status.cxw health check status page mentioned earlier in the Internal Load Balancers section.

There are two modes of database mirroring: synchronous with automatic failover, and asynchronous mirroring. In this example, synchronous failover mirroring will be covered. The details of mirroring can be found here.

The most basic mirroring configuration is a pair of failover mirror members in an arbiter-controlled configuration. The arbiter is placed in a third zone within the same region to protect against a potential availability zone outage impacting both the arbiter and one of the mirror members.

There are many ways mirroring can be set up, specifically in the network configuration. In this example, we will use the network subnets defined previously in the Network Gateway and Subnet Definitions section of this document. Example IP address schemes will be provided in a following section; for the purposes of this section, only the network interfaces and designated subnets will be depicted.

Figure 7-a: Sample mirror configuration with arbiter

Disaster Recovery

InterSystems database mirroring extends the capability of high availability to also support disaster recovery in another AWS geographic region, providing operational resiliency in the unlikely event of an entire AWS region going offline. How an application is to endure such outages depends on the recovery time objective (RTO) and recovery point objective (RPO). These provide the initial framework for the analysis required to design a proper disaster recovery plan. The following link provides a guide to the items to be considered when developing a disaster recovery plan for your application. https://aws.amazon.com/disaster-recovery/

Asynchronous Database Mirroring

InterSystems IRIS Data Platform’s database mirroring provides robust capabilities for asynchronously replicating data between AWS availability zones and regions to help support the RTO and RPO goals of your disaster recovery plan. Details of async mirror members can be found here.

Similar to the earlier high availability section, a cloud load balancer provides automatic IP address failover for Virtual IP (VIP-like) capability for DR asynchronous mirroring as well, using the same mirror_status.cxw health check status page mentioned earlier in the Internal Load Balancers section.

In this example, DR asynchronous failover mirroring will be covered, along with the introduction of the AWS Route53 DNS service to provide upstream systems and client workstations with a single DNS address regardless of which availability zone or region your InterSystems IRIS deployment is operating in. Details of AWS Route53 can be found here.

Figure 8.1-a: Sample DR Asynchronous Mirroring with AWS Route53

In the above example, the IP addresses of both regions’ Elastic Load Balancers (ELB) that front-end the InterSystems IRIS instances are provided to Route53, and Route53 will only direct traffic to whichever mirror member is the active primary mirror, regardless of the availability zone or region in which it is located.
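To sketch how Route53 can present a single DNS name in front of both regions, a failover record set might be defined roughly as follows. The hosted zone ID, record name, address, and health check ID are placeholders; in practice, alias records pointing at each region's load balancer would typically be used, and a matching SECONDARY record would be created for region-2.

```
# Upsert a PRIMARY failover record for region-1. All values are placeholders.
cat > failover-primary.json <<'EOF'
{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "iris.example.com",
      "Type": "A",
      "SetIdentifier": "region-1-primary",
      "Failover": "PRIMARY",
      "TTL": 60,
      "ResourceRecords": [{ "Value": "203.0.113.10" }],
      "HealthCheckId": "11111111-2222-3333-4444-555555555555"
    }
  }]
}
EOF

aws route53 change-resource-record-sets \
    --hosted-zone-id Z0123456789EXAMPLE \
    --change-batch file://failover-primary.json
```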
Sharded Cluster

InterSystems IRIS includes a comprehensive set of capabilities to scale your applications, which can be applied alone or in combination, depending on the nature of your workload and the specific performance challenges it faces.

One of these, sharding, partitions both data and its associated cache across a number of servers, providing flexible, inexpensive performance scaling for queries and data ingestion while maximizing infrastructure value through highly efficient resource utilization. An InterSystems IRIS sharded cluster can provide significant performance benefits for a wide variety of applications, but especially for those with workloads that include one or more of the following:

High-volume or high-speed data ingestion, or a combination.
Relatively large data sets, queries that return large amounts of data, or both.
Complex queries that do large amounts of data processing, such as those that scan a lot of data on disk or involve significant compute work.

Each of these factors on its own influences the potential gain from sharding, but the benefit may be enhanced where they combine. For example, a combination of all three factors — large amounts of data ingested quickly, large data sets, and complex queries that retrieve and process a lot of data — makes many of today’s analytic workloads very good candidates for sharding.

Note that these characteristics all have to do with data; the primary function of InterSystems IRIS sharding is to scale for data volume. However, a sharded cluster can also include features that scale for user volume, when workloads involving some or all of these data-related factors also experience a very high query volume from large numbers of users. Sharding can be combined with vertical scaling as well.

Operational Overview

The heart of the sharded architecture is the partitioning of data and its associated cache across a number of systems. A sharded cluster physically partitions large database tables horizontally — that is, by row — across multiple InterSystems IRIS instances, called data nodes, while allowing applications to transparently access these tables through any node and still see the whole dataset as one logical union. This architecture provides three advantages:

Parallel processing: Queries are run in parallel on the data nodes, with the results merged, combined, and returned to the application as full query results by the node the application connected to, significantly enhancing execution speed in many cases.

Partitioned caching: Each data node has its own cache, dedicated to the sharded table data partition it stores, rather than a single instance’s cache serving the entire data set, which greatly reduces the risk of overflowing the cache and forcing performance-degrading disk reads.

Parallel loading: Data can be loaded onto the data nodes in parallel, reducing cache and disk contention between the ingestion workload and the query workload and improving the performance of both.

Details of the InterSystems IRIS sharded cluster can be found here.

Elements of Sharding and Instance Types

A sharded cluster consists of at least one data node and, if needed for specific performance or workload requirements, an optional number of compute nodes. These two node types offer simple building blocks presenting a simple, transparent, and efficient scaling model.

Data Nodes

Data nodes store data. At the physical level, sharded table data[1] is spread across all data nodes in the cluster, and non-sharded table data is physically stored on the first data node only.
This distinction is transparent to the user, with the possible sole exception that the first node might have slightly higher storage consumption than the others; this difference is expected to become negligible, as sharded table data would typically outweigh non-sharded table data by at least an order of magnitude.

Sharded table data can be rebalanced across the cluster when needed, typically after adding new data nodes. This moves “buckets” of data between nodes to approximate an even distribution of data.

At the logical level, non-sharded table data and the union of all sharded table data are visible from any node, so clients will see the whole dataset regardless of which node they connect to. Metadata and code are also shared across all data nodes.

The basic architecture diagram for a sharded cluster simply consists of data nodes that appear uniform across the cluster. Client applications can connect to any node and will experience the data as if it were local.

Figure 9.2.1-a: Basic Sharded Cluster Diagram

[1] For convenience, the term “sharded table data” is used throughout the document to represent “extent” data for any data model supporting sharding that is marked as sharded. The terms “non-sharded table data” and “non-sharded data” are used to represent data that is in a shardable extent not marked as such, or for a data model that simply doesn’t support sharding yet.

Compute Nodes

For advanced scenarios where low latencies are required, potentially at odds with a constant influx of data, compute nodes can be added to provide a transparent caching layer for servicing queries.

Compute nodes cache data. Each compute node is associated with a data node for which it caches the corresponding sharded table data and, in addition to that, it also caches non-sharded table data as needed to satisfy queries.

Figure 9.2.2-a: Shard cluster with Compute Nodes

Because compute nodes don’t physically store any data and are meant to support query execution, their hardware profile can be tailored to suit those needs, for example by emphasizing memory and CPU and keeping storage to the bare minimum. Ingestion is forwarded to the data nodes, either directly by the driver (xDBC, Spark) or implicitly by the sharding manager code when “bare” application code runs on a compute node.

Sharded Cluster Illustrations

There are various combinations for deploying a sharded cluster. The following high-level diagrams are provided to illustrate the most common deployment models. These diagrams do not include the networking gateways and other details, in order to focus only on the sharded cluster components.

Basic Sharded Cluster

The following diagram shows the simplest sharded cluster, with four data nodes deployed in a single region and in a single zone. An AWS Elastic Load Balancer (ELB) is used to distribute client connections to any of the sharded cluster nodes.

Figure 9.3.1-a: Basic Sharded Cluster

In this basic model, there is no resiliency or high availability provided beyond what AWS provides for a single virtual machine and its attached SSD persistent storage. Two separate network interface adapters are recommended to provide both network security isolation for the inbound client connections and bandwidth isolation between the client traffic and the sharded cluster communications (an illustrative sketch of attaching an additional network interface follows below).
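As an illustration of the separate network interface adapters mentioned above, an additional interface in the shard-network subnet can be created and attached to a running data node roughly as follows. The subnet, security group, interface, and instance IDs are placeholders, and the chosen instance type must support multiple network interfaces.

```
# Create a second network interface in the shard-network subnet and
# attach it to an existing data node instance. IDs are placeholders.
aws ec2 create-network-interface \
    --subnet-id subnet-0shard00000000000 \
    --groups sg-0shard00000000000 \
    --description "shard network interface"

aws ec2 attach-network-interface \
    --network-interface-id eni-0123456789abcdef0 \
    --instance-id i-0123456789abcdef0 \
    --device-index 1
```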
Basic Sharded Cluster with High Availability

The following diagram shows the simplest sharded cluster with four mirrored data nodes, deployed in a single region with each node’s mirror split between availability zones. An AWS load balancer is used to distribute client connections to any of the sharded cluster nodes.

High availability is provided through InterSystems database mirroring, which maintains a synchronously replicated mirror in a secondary zone within the region.

Three separate network interface adapters are recommended to provide both network security isolation for the inbound client connections and bandwidth isolation between the client traffic, the sharded cluster communications, and the synchronous mirror traffic between the node pairs.

Figure 9.3.2-a: Basic Sharded Cluster with High Availability

This deployment model also introduces the mirror arbiter, as described in an earlier section of this article.

Sharded Cluster with Separate Compute Nodes

The following diagram expands the sharded cluster for massive user/query concurrency with separate compute nodes and four data nodes. The cloud load balancer server pool only contains the addresses of the compute nodes. Updates and data ingestion continue to go directly to the data nodes as before, to sustain ultra-low-latency performance and avoid interference and congestion of resources between query/analytical workloads and real-time data ingestion.

With this model, the allocation of resources can be fine-tuned to scale compute/query and ingestion independently, allowing for optimal resources where needed in a “just-in-time” fashion and maintaining an economical yet simple solution, rather than wasting resources unnecessarily just to scale compute or data.

Compute nodes lend themselves to a very straightforward use of AWS Auto Scaling groups to allow for automatic addition or deletion of instances from the group based on increased or decreased load. Auto Scaling works by adding more instances when there is more load (scaling out), and deleting instances when the need for instances is lowered (scaling in). Details of AWS Auto Scaling can be found here.

Figure 9.3.3-a: Sharded Cluster with Separate Compute and Data Nodes

Auto Scaling helps cloud-based applications gracefully handle increases in traffic and reduces cost when the need for resources is lower. Simply define the policy, and the auto scaler performs automatic scaling based on the measured load.

Backup Operations

There are multiple options available for backup operations. The following three options are viable for your AWS deployment with InterSystems IRIS.

The first two options, detailed below, incorporate a snapshot-type procedure, which involves suspending database writes to disk prior to creating the snapshot and then resuming updates once the snapshot is successful.

The following high-level steps are taken to create a clean backup using either of the snapshot methods:

Pause writes to the database via the database External Freeze API call.
Create snapshots of the OS + data disks.
Resume database writes via the External Thaw API call.
The backup facility archives the snapshots to the backup location.

Details of the External Freeze/Thaw APIs can be found here.
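As a heavily simplified illustration of the freeze/snapshot/thaw sequence above, a backup step might be sketched as follows. This is not a production-ready script: the instance name, volume IDs, and return-code handling are placeholders, and the exact ExternalFreeze/ExternalThaw invocations and their documented exit codes should be verified against current InterSystems documentation.

```
#!/bin/bash
# Minimal illustrative sketch only -- not a production backup script.
# Instance name and volume IDs are placeholders.

# 1. Pause database writes (External Freeze)
iris session IRIS -U %SYS "##Class(Backup.General).ExternalFreeze()"
# NOTE: a real script must check the exit status here against the documented codes

# 2. Snapshot the OS and data EBS volumes
for VOL in vol-0aaaa1111bbbb2222 vol-0cccc3333dddd4444; do
    aws ec2 create-snapshot --volume-id "$VOL" \
        --description "IRIS backup $(date +%Y%m%d-%H%M)"
done

# 3. Resume database writes (External Thaw)
iris session IRIS -U %SYS "##Class(Backup.General).ExternalThaw()"
```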
Note: Sample scripts for backups are not included in this document; however, periodically check for examples posted to the InterSystems Developer Community: www.community.intersystems.com

The third option is InterSystems Online Backup. This is an entry-level approach for smaller deployments with a very simple use case and interface. However, as databases increase in size, external backups with snapshot technology are recommended as a best practice, with advantages including the backup of external files, faster restore times, and an enterprise-wide view of data and management tools.

Additional steps, such as integrity checks, can be added on a periodic interval to ensure a clean and consistent backup.

The decision on which option to use depends on the operational requirements and policies of your organization. InterSystems is available to discuss the various options in more detail.

AWS Elastic Block Store (EBS) Snapshot Backup

Backup operations can be achieved using the AWS CLI command-line API along with the InterSystems ExternalFreeze/Thaw API capabilities. This allows for true 24x7 operational resiliency and assurance of clean regular backups. Details for creating, managing, and automating AWS EBS snapshots can be found here.

Logical Volume Manager (LVM) Snapshots

Alternatively, many of the third-party backup tools available on the market can be used by deploying individual backup agents within the VM itself and leveraging file-level backups in conjunction with Logical Volume Manager (LVM) snapshots. One of the major benefits of this model is the ability to perform file-level restores of either Windows- or Linux-based VMs.

A couple of points to note with this solution: since AWS and most other IaaS cloud providers do not provide tape media, all backup repositories are disk-based for short-term archiving, with the ability to leverage blob or bucket type low-cost storage for long-term retention (LTR). If using this method, it is highly recommended to use a backup product that supports de-duplication technologies to make the most efficient use of disk-based backup repositories. Some examples of these backup products with cloud support include, but are not limited to: Commvault, EMC Networker, HPE Data Protector, and Veritas NetBackup. InterSystems does not validate or endorse any one backup product over another.

Online Backup

For small deployments, the built-in Online Backup facility is also a viable option. This InterSystems database online backup utility backs up data in database files by capturing all blocks in the databases and then writes the output to a sequential file. This proprietary backup mechanism is designed to cause no downtime for users of the production system. Details of Online Backup can be found here.

In AWS, after the online backup has finished, the backup output file and all other files in use by the system must be copied to some other storage location outside of that virtual machine instance. Bucket/object storage is a good destination for this. There are two options for using an AWS Simple Storage Service (S3) bucket:

Use the AWS CLI scripting APIs directly to copy and manipulate the newly created online backup (and other non-database) files. Details can be found here.
Mount an Elastic File System (EFS) volume and use it similarly to a persistent disk at a low cost. Details of EFS can be found here.

@Mark.Bolinsky, you do insanely good work... thanks for this. Additionally, InterSystems IRIS and IRIS for Health are now available within the AWS marketplace: https://aws.amazon.com/marketplace/seller-profile?id=6e5272fb-ecd1-4111-8691-e5e24229826f

Thanks gentlemen for your documentation work. @Mark Bolinsky: It would be so perfect if you could share YAML templates to choose and deploy directly some of your examples, as done here 😉