Most transactional applications have a 70:30 RW profile. However, some special cases have extremely high write IO profiles.

I ran storage IO tests in the ap-southeast-2 (Sydney) AWS region to simulate IRIS database IO patterns and throughput similar to a very high write rate application.

The test aimed to determine whether the EC2 instance types and EBS volume types available in the AWS Australian regions will support the high IO rates and throughput required.

5 0
0 583
Article
· May 25, 2023 12m read
AWS Capacity planning review example

I am often asked to review customers' IRIS application performance data to understand if system resources are under or over-provisioned.

This recent example is interesting because it involves an application that has done a "lift and shift" migration of a large IRIS database application to the Cloud. AWS, in this case.

A key takeaway is that once you move to the Cloud, resources can be right-sized over time as needed. You do not have to buy and provision on-premises infrastructure for many years in the future that you expect to grow into.

Continuous monitoring is required. Your application transaction rate will change as your business changes, the application use or the application itself changes. This will change the system resource requirements. Planners should also consider seasonal peaks in activity. Of course, an advantage of the Cloud is resources can be scaled up or down as needed.

For more background information, there are several in-depth posts on AWS and IRIS in the community. A search for "AWS reference" is an excellent place to start. I have also added some helpful links at the end of this post.

AWS services are like Lego blocks, different sizes and shapes can be combined. I have ignored networking, security, and standing up a VPC for this post. I have focused on two of the Lego block components;
- Compute requirements.
- Storage requirements.

9 1
3 568

Overview

Predictable storage IO performance with low latency is vital to provide scalability and reliability for your applications. This set of benchmarks is to inform users of IRIS considering deploying applications in AWS about EBS gp3 volume performance.

Summary

  • An LVM stripe can increase IOPS and throughput beyond single EBS volume performance limits.
  • An LVM stripe lowers read latency.
4 1
0 1.8K

YASPE is the successor to YAPE (Yet Another pButtons Extractor). YASPE has been written from the ground up with many internal changes to allow easier maintenance and enhancements.

YASPE functions:

  • Parse and chart InterSystems Caché pButtons and InterSystems IRIS SystemPerformance files for quick performance analysis of Operating System and IRIS metrics.
  • Allow a deeper dive by creating ad-hoc charts and by creating charts combining the Operating System and IRIS metrics with the "Pretty Performance" option.
  • The "System Overview" option saves you from searching your SystemPerformance files for system details or common configuration options.

YASPE is written in Python and is available on GitHub as source code or for Docker containers at:


12 1
4 424

The following steps show you how to display a sample list of metrics available from the /api/monitor service.

In the last post, I gave an overview of the service that exposes IRIS metrics in Prometheus format. The post shows how to set up and run IRIS preview release 2019.4 in a container and then list the metrics.


This post assumes you have Docker installed. If not, go and do that now for your platform :)

17 10
5 1.2K

Released with no formal announcement in IRIS preview release 2019.4 is the /api/monitor service exposing IRIS metrics in Prometheus format. Big news for anyone wanting to use IRIS metrics as part of their monitoring and alerting solution. The API is a component of the new IRIS System Alerting and Monitoring (SAM) solution that will be released in an upcoming version of IRIS.

11 2
7 1.8K

If a picture is worth a thousand words, what's a video worth? Certainly more than typing a post.

Please check out my "Coding talks" on InterSystems Developers YouTube:

1. Analysing InterSystems IRIS System Performance with Yape. Part 1: Installing Yape

https://www.youtube.com/embed/3KClL5zT6MY
[This is an embedded link, but you cannot view embedded content directly on the site because you have declined the cookies necessary to access it. To view embedded content, you would need to accept all cookies in your Cookies Settings]

Running Yape in a container.

2. Yape Container SQLite iostat InterSystems

https://www.youtube.com/embed/cuMLSO9NQCM
[This is an embedded link, but you cannot view embedded content directly on the site because you have declined the cookies necessary to access it. To view embedded content, you would need to accept all cookies in your Cookies Settings]

Extracting and plotting pButtons data including timeframes and iostat.

13 3
2 1.4K

This post provides useful links and an overview of best practice configuration for low latency storage IO by creating LVM Physical Extent (PE) stripes for database disks on InterSystems Data Platforms; InterSystems IRIS, Caché, and Ensemble.

5 4
1 4.4K

InterSystems Data Platform includes utilities and tools for system monitoring and alerting, however System Administrators new to solutions built on the InterSystems Data Platform (a.k.a Caché) need to know where to start and what to configure.

This guide shows the path to a minimum monitoring and alerting solution using references from online documentation and developer community posts to show you how to enable and configure the following;

  1. Caché Monitor: Scans the console log and sends emails alerts.

  2. System Monitor: Monitors system status and resources, generating notifications (alerts and warnings) based on fixed parameters and also tracks overall system health.

  3. Health Monitor: Samples key system and user-defined metrics and compares them to user-configurable parameters and established normal values, generating notifications when samples exceed applicable or learned thresholds.

  4. History Monitor: Maintains a historical database of performance and system usage metrics.

  5. pButtons: Operating system and Caché metrics collection scheduled daily.

Remember this guide is a minimum configuration, the included tools are flexible and extensible so more functionality is available when needed. This guide skips through the documentation to get you up and going. You will need to dive deeper into the documentation to get the most out of the monitoring tools, in the meantime, think of this as a set of cheat sheets to get up and running.

12 1
7 2K

A request came from a customer to estimate how long it would take to encrypt a database with cvencrypt utility.

This question is a little bit like how long is a piece of string — it depends. But its an interesting question. The answer primarily depends on the performance of CPU and storage on the target platform the customer is using, so the answer is more about coming up with a simple methodology that can be used to benchmark the CPU and storage while running cvencrypt.

6 0
1 946

I am often asked by customers, vendors or internal teams to explain CPU capacity planning for large production databases running on VMware vSphere.

In summary there are a few simple best practices to follow for sizing CPU for large production databases:

  • Plan for one vCPU per physical CPU core.
  • Consider NUMA and ideally size VMs to keep CPU and memory local to a NUMA node.
  • Right-size virtual machines. Add vCPUs only when needed.

Generally this leads to a couple of common questions:

5 7
0 6.1K

I am looking for experience of people running Veeam with Caché databases.

Tips/Tricks/General questions like; what Veeam features are you using, what your backup cycle looks like, where does your data end up, what recovery/integrity checks you do, what sort of compression/dedupe you get.

Also what questions _you_ have and what problems you might be trying to solve.

0 9
0 1.8K

Hi, this post was initially written for Caché. In June 2023, I finally updated it for IRIS. If you are revisiting the post since then, the only real change is substituting Caché for IRIS! I also updated the links for IRIS documentation and fixed a few typos and grammatical errors. Enjoy :)

19 24
5 10.3K

Hyper-Converged Infrastructure (HCI) solutions have been gaining traction for the last few years with the number of deployments now increasing rapidly. IT decision makers are considering HCI when scoping new deployments or hardware refreshes especially for applications already virtualised on VMware. Reasons for choosing HCI include; dealing with a single vendor, validated interoperability between all hardware and software components, high performance especially IO, simple scalability by addition of hosts, simplified deployment and simplified management.

9 7
1 3.4K