A short post for now to answer a question that came up. In post two of this series I included graphs of performance data extracted from pButtons. I was asked off-line if there is a quicker way than cut/paste to extract metrics for mgstat etc from a pButtons .html file for easy charting in Excel.

See: - Part 2 - Looking at the metrics we collected

7 2
0 1.5K

In the previous parts (1, 2) we talked about globals as trees. In this article, we will look at them as sparse arrays.

A sparse array - is a type of array where most values assume an identical value.

In practice, you will often see sparse arrays so huge that there is no point in occupying memory with identical elements. Therefore, it makes sense to organize sparse arrays in such a way that memory is not wasted on storing duplicate values.

In some programming languages, sparse arrays are part of the language - for example, in J, MATLAB. In other languages, there are special libraries that let you use them. For C++, those would be Eigen and the like.

Globals are good candidates for implementing sparse arrays for the following reasons:

8 3
1 1.5K

Continuing on with providing some examples of various storage technologies and their performance profiles, this time we looked at the growing trend of leveraging internal commodity-based server storage, specifically the new HPE Cloudline 3150 Gen10 AMD processor-based single socket servers with two 3.2TB Samsung PM1725a NVMe drives.

4 2
0 1.4K

Most transactional applications have a 70:30 RW profile. However, some special cases have extremely high write IO profiles.

I ran storage IO tests in the ap-southeast-2 (Sydney) AWS region to simulate IRIS database IO patterns and throughput similar to a very high write rate application.

The test aimed to determine whether the EC2 instance types and EBS volume types available in the AWS Australian regions will support the high IO rates and throughput required.

5 0
0 1.4K
Article
· Sep 30, 2016 1m read
ECP Magic

I saw someone recently refer to ECP as magic. It certainly seems so, and there is a lot of very clever engineering to make it work. But the following sequence of diagrams is a simple view of how data is retrieved and used across a distributed architecture.

For more more on ECP including capacity planning follow this link: Data Platforms and Performance - Part 7 ECP for performance, scalability and availability

10 0
0 1.3K

While reviewing our documentation for our ^pButtons (in IRIS renamed as ^SystemPerformance) performance monitoring utility, a customer told me: "I understand all of this, but I wish it could be simpler… easier to define profiles, manage them etc.".

After this session I thought it would be a nice exercise to try and provide some easier human interface for this.

The first step in this was to wrap a class-based API to the existing pButtons routine.

I was also able to add some more "features" like showing what profiles are currently running, their time remaining to run, previously running processes and more.

The next step was to add on top of this API, a REST API class.

With this artifact (a pButtons REST API) in hand, one can go ahead and build a modern UI on top of that.

For example -

6 15
4 1.2K
Article
· Jan 11, 2019 4m read
SQL Performance Resources

There are three things most important to any SQL performance conversation: Indices, TuneTable, and Show Plan. The attached PDFs includes historical presentations on these topics that cover the basics of these 3 things in one place. Our documentation provides more detail on these and other SQL Performance topics in the links below. The eLearning options reinforces several of these topics. In addition, there are several Developer Community articles which touch on SQL performance, and those relevant links are also listed.

There is a fair amount of repetition in the information listed below. The most important aspects of SQL performance to consider are:

  1. The types of indices available
  2. Using one index type over another
  3. The information TuneTable gathers for a table and what it means to the Optimizer
  4. How to read a Show Plan to better understand if a query is good or bad
13 3
8 1.1K
Article
· Jul 7, 2017 19m read
Indexing of non-atomic attributes

Quotes (1NF/2NF/3NF)ru:

Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
The same value can be atomic or non-atomic depending on the purpose of this value. For example, “4286” can be
  • atomic, if its denotes “a credit card’s PIN code” (if it’s broken down or reshuffled, it is of no use any longer)
  • non-atomic, if it’s just a “sequence of numbers” (the value still makes sense if broken down into several parts or reshuffled)

This article explores the standard methods of increasing the performance of SQL queries involving the following types of fields: string, date, simple list (in the $LB format), "list of <...>" and "array of <...>".

7 0
0 1.1K
Article
· Oct 1, 2018 4m read
Profiling code using Caché Monitor

Not everyone knows that InterSystems Caché has a built-in tool for code profiling called Caché Monitor.

Its main purpose (obviously) is the collection of statistics for programs running in Caché. It can provide statistics by program, as well as detailed Line-by-Line statistics for each program.

Using Caché Monitor

Let’s take a look at a potential use case for Caché Monitor and its key features. So, in order to start the profiler, you need to go to the terminal and switch to the namespace that you want to monitor, then launch the %SYS.MONLBL system routine:

3 1
7 1.1K
Article
· May 25, 2023 12m read
AWS Capacity planning review example

I am often asked to review customers' IRIS application performance data to understand if system resources are under or over-provisioned.

This recent example is interesting because it involves an application that has done a "lift and shift" migration of a large IRIS database application to the Cloud. AWS, in this case.

A key takeaway is that once you move to the Cloud, resources can be right-sized over time as needed. You do not have to buy and provision on-premises infrastructure for many years in the future that you expect to grow into.

Continuous monitoring is required. Your application transaction rate will change as your business changes, the application use or the application itself changes. This will change the system resource requirements. Planners should also consider seasonal peaks in activity. Of course, an advantage of the Cloud is resources can be scaled up or down as needed.

For more background information, there are several in-depth posts on AWS and IRIS in the community. A search for "AWS reference" is an excellent place to start. I have also added some helpful links at the end of this post.

AWS services are like Lego blocks, different sizes and shapes can be combined. I have ignored networking, security, and standing up a VPC for this post. I have focused on two of the Lego block components;
- Compute requirements.
- Storage requirements.

9 1
3 1.1K

A More Industrial-Looking Global Storage Scheme

In the first article in this series, we looked at the entity–attribute–value (EAV) model in relational databases, and took a look at the pros and cons of storing those entities, attributes and values in tables. We learned that, despite the benefits of this approach in terms of flexibility, there are some real disadvantages, in particular a basic mismatch between the logical structure of the data and its physical storage, which causes various difficulties.

2 0
0 903

It seems like yesterday when we did a small project in Java to test the performance of IRIS, PostgreSQL and MySQL (you can review the article we wrote back in June at the end of this article). If you remember, IRIS was superior to PostgreSQL and clearly superior to MySQL in insertions, with no big difference in queries.

8 6
3 870

YASPE is the successor to YAPE (Yet Another pButtons Extractor). YASPE has been written from the ground up with many internal changes to allow easier maintenance and enhancements.

YASPE functions:

  • Parse and chart InterSystems Caché pButtons and InterSystems IRIS SystemPerformance files for quick performance analysis of Operating System and IRIS metrics.
  • Allow a deeper dive by creating ad-hoc charts and by creating charts combining the Operating System and IRIS metrics with the "Pretty Performance" option.
  • The "System Overview" option saves you from searching your SystemPerformance files for system details or common configuration options.

YASPE is written in Python and is available on GitHub as source code or for Docker containers at:


13 1
5 750
Article
· Jul 26, 2017 3m read
What is APM?

What is APM?

I am talking about Application Performance Management at global summit, and several people have asked what that means so it is time for a bit of an explanation.

APM or Application Performance Management (sometimes referred to as Application Performance Monitoring) has a very good (if complicated) explanation on Wikipedia but to me it just means looking at performance from the users’ point of view and the level of service provided to them.

3 0
1 743

[Background]

InterSystems IRIS family has a nice utility ^SystemPerformance (as known as ^pButtons in Caché and Ensemble) which outputs the database performance information into a readable HTML file. When you run ^SystemPerformance on IRIS for Windows, a HTML file is created where both our own performance log mgstat and Windows performance log are included.

12 2
4 723

A customer recently asked if IRIS supported OpenTelemetry as they where seeking to measure the time that IRIS implemented SOAP Services take to complete. The customer already has several other technologies that support OpenTelemetry for process tracing. At this time, InterSystems IRIS (IRIS) do not natively support OpenTelemetry.

12 5
1 702

What is %SQLRESTRICT

%SQLRESTRICT is a special %FILTER clause for use in MDX queries in InterSystems IRIS Business Intelligence. Since this function begins with %, it means this is a special MDX extension created by InterSystems. It allows users to insert an SQL statement that will be used to restrict the returned records in the MDX Result Set. This SQL statement must return a set of Source Record IDs to limit the results by. Please see the documentation for more information.

Why is this useful?

This is useful because there are often times users want to restrict the results in their MDX Result Set based on information that is not in their cubes. It may be the case that this information may not make sense to be in the cube. Other times this can be useful when there is a large set of values you want to restrict. As mentioned before, this is not a standard MDX function, it was created by InterSystems to handle cases were queries were not performing well or cases that were not easily solved by existing functions.

6 0
2 683

A few years ago, I was teaching the basics of our %UnitTest framework during Caché Foundations class (now called Developing Using InterSystems Objects and SQL). A student asked if it was possible to collect performance statistics while running unit tests. A few weeks later, I added some additional code to the %UnitTest examples to answer this question. I’m finally sharing it on the Community.

5 2
2 682

Dynamic PoolSize (DPS) Experiment

Purpose:

Enhance Ensemble or IRIS production so it can dynamically allocate pool size for adapter-based components based on their utilization.

Sometimes, an unexpected traffic volume occurs, and default pool size allocated to production components may become a bottleneck. To avoid such situations, I created a demonstrator project some 2 years ago to see, whether it would be possible and feasible to modify production, so it allowed for dynamically modifying its components per their load.

5 3
0 549

Our team is reworking an application to use REST services that use the same database as our current ZEN application. One of the new REST endpoints uses a query that ran very slowly when first implemented. After some analysis, we found that an index on one of the fields in the table greatly improved performance (a query that took 35 seconds was now taking a fraction of a second).

3 4
0 501

When there's a performance issue, whether for all users on the system or a single process, the shortest path to understanding the root cause is usually to understand what the processes in question are spending their time doing. Are they mostly using CPU to dutifully march through their algorithm (for better or worse); or are they mostly reading database blocks from disk; or mostly waiting for something else, like LOCKs, ECP or database block collisions?

15 1
4 482

Like hardware hosts, virtual hosts in public and private clouds can develop resource bottlenecks as workloads increase. If you are using and managing InterSystems IRIS instances deployed in public or private clouds, you may have encountered a situation in which addressing performance or other issues requires increasing the capacity of an instance's host (that is, vertically scaling).

5 1
0 481

Windows Subsystem for Linux (WSL) is a feature of Windows that allows you to run a Linux environment on your Windows machine, without the need for a separate virtual machine or dual booting.

WSL is designed to provide a seamless and productive experience for developers who want to use both Windows and Linux at the same time**.

2 0
1 396