This post will show you an approach to size shared memory requirements for database applications running on InterSystems data platforms including global and routine buffers, gmheap, and locksize as well as some performance tips you should consider when configuring servers and when virtualizing Caché applications. As ever when I talk about Caché I mean all the data platform (Ensemble, HealthShare, iKnow and Caché).


A list of other posts in this series is here

29 3
7 9.9K

Your application is deployed and everything is running fine. Great, hi-five! Then out of the blue the phone starts to ring off the hook – it’s users complaining that the application is sometimes ‘slow’. But what does that mean? Sometimes? What tools do you have and what statistics should you be looking at to find and resolve this slowness? Is your system infrastructure up to the task of the user load? What infrastructure design questions should you have asked before you went into production? How can you capacity plan for new hardware with confidence and without over-spec'ing? How can you stop the phone ringing? How could you have stopped it ringing in the first place?

23 13
5 4.4K

In the last post we scheduled 24-hour collections of performance metrics using pButtons. In this post we are going to be looking at a few of the key metrics that are being collected and how they relate to the underlying system hardware. We will also start to explore the relationship between Caché (or any of the InterSystems Data Platforms) metrics and system metrics. And how you can use these metrics to understand the daily beat rate of your systems and diagnose performance problems.

19 10
2 3.8K

There are often questions surrounding the ideal Apache HTTPD Web Server configuration for HealthShare. The contents of this article will outline the initial recommended web server configuration for any HealthShare product.

As a starting point, Apache HTTPD version 2.4.x (64-bit) is recommended. Earlier versions such as 2.2.x are available, however version 2.2 is not recommended for performance and scalability of HealthShare.

18 0
15 10.7K

The following steps show you how to display a sample list of metrics available from the /api/monitor service.

In the last post, I gave an overview of the service that exposes IRIS metrics in Prometheus format. The post shows how to set up and run IRIS preview release 2019.4 in a container and then list the metrics.


This post assumes you have Docker installed. If not, go and do that now for your platform :)

17 10
5 1.2K

While the integrity of Caché and InterSystems IRIS databases is completely protected from the consequences of system failure, physical storage devices do fail in ways that corrupt the data they store. For that reason, many sites choose to run regular database integrity checks, particularly in coordination with backups to validate that a given backup could be relied upon in a disaster.

16 8
9 1.8K

Globals, these magic swords for storing data, have been around for a while, but not many people can use them efficiently or know about this super-weapon altogether.

If you use globals for tasks where they truly shine, the results may be amazing, either in terms of increased performance or dramatic simplification of the overall solution (1, 2).

Globals offer a special way of storing and processing data, which is completely different from SQL tables. They were first introduced in 1966 in the M(UMPS) programming language, which was initially used in medical databases. It is still used in the same way, but has also been adopted by some other industries where reliability and high performance are top priorities: finance, trading, etc.

Later M(UMPS) evolved into Caché ObjectScript (COS). COS was developed by InterSystems as a superset of M. The original language is still accepted by developers' community and alive in a few implementations. There are several signs of activity around the web: MUMPS Google group, Mumps User's group), effective ISO Standard, etc.

Modern global based DBMS supports transactions, journaling, replication, partitioning. It means that they can be used for building modern, reliable and fast distributed systems.

Globals do not restrict you to the boundaries of the relational model. They give you the freedom of creating data structures optimized for particular tasks. For many applications reasonable use of globals can be a real silver bullet offering speeds that developers of conventional relational applications can only dream of.

Globals as a method of storing data can be used in many modern programming languages, both high- and low-level. Therefore, this article will focus specifically on globals and not the language they once came from.

14 10
0 2.3K

This week I am going to look at CPU, one of the primary hardware food groups :) A customer asked me to advise on the following scenario; Their production servers are approaching end of life and its time for a hardware refresh. They are also thinking of consolidating servers by virtualising and want to right-size capacity either bare-metal or virtualized. Today we will look at CPU, in later posts I will explain the approach for right-sizing other key food groups - memory and IO.

So the questions are:

14 10
2 4.8K
Article
· May 25, 2016 5m read
Random Read IO Storage Performance Tool

New Tool Available

Please see PerfTools IO Test Suite for a later version of the Random Read IO tool.

Purpose

This tool is used to generate random read Input/Output (IO) from within the database. The goal of this tool is to drive as many jobs as possible to achieve target IOPS and ensure acceptable disk response times are sustained. Results gathered from the IO tests will vary from configuration to configuration based on the IO sub-system. Before running these tests ensure corresponding operating system and storage level monitoring are configured to capture IO performance metrics for later analysis.

14 17
2 3.4K

If a picture is worth a thousand words, what's a video worth? Certainly more than typing a post.

Please check out my "Coding talks" on InterSystems Developers YouTube:

1. Analysing InterSystems IRIS System Performance with Yape. Part 1: Installing Yape

https://www.youtube.com/embed/3KClL5zT6MY
[This is an embedded link, but you cannot view embedded content directly on the site because you have declined the cookies necessary to access it. To view embedded content, you would need to accept all cookies in your Cookies Settings]

Running Yape in a container.

2. Yape Container SQLite iostat InterSystems

https://www.youtube.com/embed/cuMLSO9NQCM
[This is an embedded link, but you cannot view embedded content directly on the site because you have declined the cookies necessary to access it. To view embedded content, you would need to accept all cookies in your Cookies Settings]

Extracting and plotting pButtons data including timeframes and iostat.

13 3
2 1.4K

When there's a performance issue, whether for all users on the system or a single process, the shortest path to understanding the root cause is usually to understand what the processes in question are spending their time doing. Are they mostly using CPU to dutifully march through their algorithm (for better or worse); or are they mostly reading database blocks from disk; or mostly waiting for something else, like LOCKs, ECP or database block collisions?

13 0
2 228

YASPE is the successor to YAPE (Yet Another pButtons Extractor). YASPE has been written from the ground up with many internal changes to allow easier maintenance and enhancements.

YASPE functions:

  • Parse and chart InterSystems Caché pButtons and InterSystems IRIS SystemPerformance files for quick performance analysis of Operating System and IRIS metrics.
  • Allow a deeper dive by creating ad-hoc charts and by creating charts combining the Operating System and IRIS metrics with the "Pretty Performance" option.
  • The "System Overview" option saves you from searching your SystemPerformance files for system details or common configuration options.

YASPE is written in Python and is available on GitHub as source code or for Docker containers at:


12 1
4 424

[Background]

InterSystems IRIS family has a nice utility ^SystemPerformance (as known as ^pButtons in Caché and Ensemble) which outputs the database performance information into a readable HTML file. When you run ^SystemPerformance on IRIS for Windows, a HTML file is created where both our own performance log mgstat and Windows performance log are included.

11 1
2 382

A customer recently asked if IRIS supported OpenTelemetry as they where seeking to measure the time that IRIS implemented SOAP Services take to complete. The customer already has several other technologies that support OpenTelemetry for process tracing. At this time, InterSystems IRIS (IRIS) do not natively support OpenTelemetry.

11 4
1 237

Myself and the other Technology Architects often have to explain to customers and vendors Caché IO requirements and the way that Caché applications will use storage systems. The following tables are useful when explaining typical Caché IO profile and requirements for a transactional database application with customers and vendors. The original tables were created by Mark Bolinsky.

In future posts I will be discussing more about storage IO so am also posting these tables now as a reference for those articles.

9 7
2 2.7K
Article
· May 25, 2023 12m read
AWS Capacity planning review example

I am often asked to review customers' IRIS application performance data to understand if system resources are under or over-provisioned.

This recent example is interesting because it involves an application that has done a "lift and shift" migration of a large IRIS database application to the Cloud. AWS, in this case.

A key takeaway is that once you move to the Cloud, resources can be right-sized over time as needed. You do not have to buy and provision on-premises infrastructure for many years in the future that you expect to grow into.

Continuous monitoring is required. Your application transaction rate will change as your business changes, the application use or the application itself changes. This will change the system resource requirements. Planners should also consider seasonal peaks in activity. Of course, an advantage of the Cloud is resources can be scaled up or down as needed.

For more background information, there are several in-depth posts on AWS and IRIS in the community. A search for "AWS reference" is an excellent place to start. I have also added some helpful links at the end of this post.

AWS services are like Lego blocks, different sizes and shapes can be combined. I have ignored networking, security, and standing up a VPC for this post. I have focused on two of the Lego block components;
- Compute requirements.
- Storage requirements.

9 1
3 567

Hyper-Converged Infrastructure (HCI) solutions have been gaining traction for the last few years with the number of deployments now increasing rapidly. IT decision makers are considering HCI when scoping new deployments or hardware refreshes especially for applications already virtualised on VMware. Reasons for choosing HCI include; dealing with a single vendor, validated interoperability between all hardware and software components, high performance especially IO, simple scalability by addition of hosts, simplified deployment and simplified management.

9 7
1 3.4K

In the previous parts (1, 2) we talked about globals as trees. In this article, we will look at them as sparse arrays.

A sparse array - is a type of array where most values assume an identical value.

In practice, you will often see sparse arrays so huge that there is no point in occupying memory with identical elements. Therefore, it makes sense to organize sparse arrays in such a way that memory is not wasted on storing duplicate values.

In some programming languages, sparse arrays are part of the language - for example, in J, MATLAB. In other languages, there are special libraries that let you use them. For C++, those would be Eigen and the like.

Globals are good candidates for implementing sparse arrays for the following reasons:

8 3
1 1.3K
Article
· Jan 11, 2019 4m read
SQL Performance Resources

There are three things most important to any SQL performance conversation: Indices, TuneTable, and Show Plan. The attached PDFs includes historical presentations on these topics that cover the basics of these 3 things in one place. Our documentation provides more detail on these and other SQL Performance topics in the links below. The eLearning options reinforces several of these topics. In addition, there are several Developer Community articles which touch on SQL performance, and those relevant links are also listed.

There is a fair amount of repetition in the information listed below. The most important aspects of SQL performance to consider are:

  1. The types of indices available
  2. Using one index type over another
  3. The information TuneTable gathers for a table and what it means to the Optimizer
  4. How to read a Show Plan to better understand if a query is good or bad
8 1
4 905
Article
· Jul 7, 2017 19m read
Indexing of non-atomic attributes

Quotes (1NF/2NF/3NF)ru:

Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
The same value can be atomic or non-atomic depending on the purpose of this value. For example, “4286” can be
  • atomic, if its denotes “a credit card’s PIN code” (if it’s broken down or reshuffled, it is of no use any longer)
  • non-atomic, if it’s just a “sequence of numbers” (the value still makes sense if broken down into several parts or reshuffled)

This article explores the standard methods of increasing the performance of SQL queries involving the following types of fields: string, date, simple list (in the $LB format), "list of <...>" and "array of <...>".

7 0
0 970

A short post for now to answer a question that came up. In post two of this series I included graphs of performance data extracted from pButtons. I was asked off-line if there is a quicker way than cut/paste to extract metrics for mgstat etc from a pButtons .html file for easy charting in Excel.

See: - Part 2 - Looking at the metrics we collected

7 2
0 1.4K

It seems like yesterday when we did a small project in Java to test the performance of IRIS, PostgreSQL and MySQL (you can review the article we wrote back in June at the end of this article). If you remember, IRIS was superior to PostgreSQL and clearly superior to MySQL in insertions, with no big difference in queries.

7 6
3 424

What is %SQLRESTRICT

%SQLRESTRICT is a special %FILTER clause for use in MDX queries in InterSystems IRIS Business Intelligence. Since this function begins with %, it means this is a special MDX extension created by InterSystems. It allows users to insert an SQL statement that will be used to restrict the returned records in the MDX Result Set. This SQL statement must return a set of Source Record IDs to limit the results by. Please see the documentation for more information.

Why is this useful?

This is useful because there are often times users want to restrict the results in their MDX Result Set based on information that is not in their cubes. It may be the case that this information may not make sense to be in the cube. Other times this can be useful when there is a large set of values you want to restrict. As mentioned before, this is not a standard MDX function, it was created by InterSystems to handle cases were queries were not performing well or cases that were not easily solved by existing functions.

6 0
1 537