InterSystems IRIS for Health™ is the world’s first and only data platform engineered specifically for the rapid development of healthcare applications to manage the world’s most critical data. It includes powerful out-of-the-box features: transaction processing and analytics, an extensible healthcare data model, FHIR-based solution development, support for healthcare interoperability standards, and more. All enabling developers to realize value and build breakthrough applications, fast. Learn more.
Your application is deployed and everything is running fine. Great, hi-five! Then out of the blue the phone starts to ring off the hook – it’s users complaining that the application is sometimes ‘slow’. But what does that mean? Sometimes? What tools do you have and what statistics should you be looking at to find and resolve this slowness? Is your system infrastructure up to the task of the user load? What infrastructure design questions should you have asked before you went into production? How can you capacity plan for new hardware with confidence and without over-spec'ing? How can you stop the phone ringing? How could you have stopped it ringing in the first place?
This week I am going to look at CPU, one of the primary hardware food groups :) A customer asked me to advise on the following scenario; Their production servers are approaching end of life and its time for a hardware refresh. They are also thinking of consolidating servers by virtualising and want to right-size capacity either bare-metal or virtualized. Today we will look at CPU, in later posts I will explain the approach for right-sizing other key food groups - memory and IO.
In the last post we scheduled 24-hour collections of performance metrics using pButtons. In this post we are going to be looking at a few of the key metrics that are being collected and how they relate to the underlying system hardware. We will also start to explore the relationship between Caché (or any of the InterSystems Data Platforms) metrics and system metrics. And how you can use these metrics to understand the daily beat rate of your systems and diagnose performance problems.
This post will show you an approach to size shared memory requirements for database applications running on InterSystems data platforms including global and routine buffers, gmheap, and locksize as well as some performance tips you should consider when configuring servers and when virtualizing Caché applications. As ever when I talk about Caché I mean all the data platform (Ensemble, HealthShare, iKnow and Caché).
While this article is about InterSystems IRIS, it also applies to Caché, Ensemble, and HealthShare distributions.
Memory is managed in pages. The default page size is 4KB on Linux systems. Red Hat Enterprise Linux 6, SUSE Linux Enterprise Server 11, and Oracle Linux 6 introduced a method to provide an increased page size in 2MB or 1GB sizes depending on system configuration know as HugePages.
At first HugePages required to be assigned at boot time, and if not managed or calculated appropriately could result in wasted resources. As a result various Linux distributions introduced Transparent HugePages with the 2.6.38 kernel as enabled by default. This was meant as a means to automate creating, managing, and using HugePages. Prior kernel versions may have this feature as well however may not be marked as [always] and potentially set to [madvise].
Transparent Huge Pages (THP) is a Linux memory management system that reduces the overhead of Translation Lookaside Buffer (TLB) lookups on machines with large amounts of memory by using larger memory pages. However in current Linux releases THP can only map individual process heap and stack space.
One of the great availability and scaling features of Caché is Enterprise Cache Protocol (ECP). With consideration during application development distributed processing using ECP allows a scale out architecture for Caché applications. Application processing can scale to very high rates from a single application server to the processing power of up to 255 application servers with no application changes.
Myself and the other Technology Architects often have to explain to customers and vendors Caché IO requirements and the way that Caché applications will use storage systems. The following tables are useful when explaining typical Caché IO profile and requirements for a transactional database application with customers and vendors. The original tables were created by Mark Bolinsky.
In future posts I will be discussing more about storage IO so am also posting these tables now as a reference for those articles.
This is a list of all the posts in the data platforms capacity planning and performance series in order. Also a general list of my other posts. I will update as new posts in the series are added.
You will notice that I wrote some of the posts before IRIS was released and refer to Caché. I will revisit the posts over time, but in the meantime; Generally, the advice for configuration is the same for Caché and IRIS. Some command names may have changed; the most obvious example is that anywhere you see ^pButtons command, you can replace it with ^SystemPerformance.
Capacity Planning and Performance Series
Generally posts build on previous, but you can also just dive in to subjects that look interesting.
This article provides a reference architecture as a sample for providing robust performing and highly available applications based on InterSystems Technologies that are applicable to Caché, Ensemble, HealthShare, TrakCare, and associated embedded technologies such as DeepSee, iKnow, Zen and Zen Mojo.
Azure has two different deployment models for creating and working with resources: Azure Classic and Azure Resource Manager. The information detailed in this article is based on the Azure Resource Manager model (ARM).