Article
· May 25, 2016 5m read
Random Read IO Storage Performance Tool

New Tool Available

Please see PerfTools IO Test Suite for a later version of the Random Read IO tool.

Purpose

This tool is used to generate random read Input/Output (IO) from within the database. The goal of this tool is to drive as many jobs as possible to achieve target IOPS and ensure acceptable disk response times are sustained. Results gathered from the IO tests will vary from configuration to configuration based on the IO sub-system. Before running these tests, ensure that the corresponding operating-system and storage-level monitoring is configured to capture IO performance metrics for later analysis.


While reviewing the documentation for our ^pButtons performance monitoring utility (renamed ^SystemPerformance in IRIS), a customer told me: "I understand all of this, but I wish it could be simpler… easier to define profiles, manage them, etc.".

After that session I thought it would be a nice exercise to try to provide an easier, more human interface for this.

The first step was to wrap a class-based API around the existing pButtons routine.

I was also able to add some more "features", such as showing which profiles are currently running, how much run time they have remaining, which processes ran previously, and more.

The next step was to add a REST API class on top of this API.
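To make this concrete, here is a minimal sketch of what such a REST dispatch class could look like. The class, route, and method names (MyApp.pButtons.REST, MyApp.pButtons.API, GetProfiles, StartProfile) are hypothetical placeholders rather than the actual API of the project:

Class MyApp.pButtons.REST Extends %CSP.REST
{

XData UrlMap [ XMLNamespace = "http://www.intersystems.com/urlmap" ]
{
<Routes>
  <Route Url="/profiles" Method="GET" Call="GetProfiles"/>
  <Route Url="/profiles/:name/start" Method="POST" Call="StartProfile"/>
</Routes>
}

/// Return the list of defined profiles as JSON
ClassMethod GetProfiles() As %Status
{
    // Hypothetical class-based wrapper around the ^pButtons / ^SystemPerformance routine
    Set profiles = ##class(MyApp.pButtons.API).GetProfiles()
    Write profiles.%ToJSON()
    Quit $$$OK
}

/// Start a run of the named profile
ClassMethod StartProfile(name As %String) As %Status
{
    // Hypothetical wrapper call; the real API class would start the collection job
    Quit ##class(MyApp.pButtons.API).StartProfile(name)
}

}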

With this artifact (a pButtons REST API) in hand, one can go ahead and build a modern UI on top of that.



Your application is deployed and everything is running fine. Great, high-five! Then out of the blue the phone starts to ring off the hook – it’s users complaining that the application is sometimes ‘slow’. But what does that mean? Sometimes? What tools do you have and what statistics should you be looking at to find and resolve this slowness? Is your system infrastructure up to the task of the user load? What infrastructure design questions should you have asked before you went into production? How can you capacity plan for new hardware with confidence and without over-spec'ing? How can you stop the phone ringing? How could you have stopped it ringing in the first place?


Motivation

I didn't know about ObjectScript until I started my new job. ObjectScript isn't actually a young programming language. Compared to C++, Java and Python, the community isn't as active, but we're keen to make this place more vibrant, aren't we?

I've noticed that some of my colleagues are finding it tricky to get their heads around the class relationships in these huge projects. There is no easy-to-use, modern class diagram tool for ObjectScript.

Related Work

I have tried the relevant existing tools:


Hello!!!

Data migration often sounds like a simple "move data from A to B" task until you actually do it. In reality, it is a complex process that blends planning, validation, testing, and technical precision.

Over several projects where I handled data migration into a HIS which runs on IRIS (TrakCare), I realized that success comes from a mix of discipline and automation.

Here are a few points which I want to highlight.

1. Start with a Defined Data Format.

Before you even open your first file, make sure everyone, especially data providers, clearly understands the exact data format you expect. Defining templates early avoids unnecessary back-and-forth and rework later.

While Excel or CSV formats are common, I personally feel using a tab-delimited text file (.txt) for data upload is best. It's lightweight, consistent, and avoids issues with commas inside text fields.

PatID   DOB Gender  AdmDate
10001   2000-01-02  M   2025-10-01
10002   1998-01-05  F   2025-10-05
10005   1980-08-23  M   2025-10-15

Make sure that the date format in the file is correct and consistent throughout. These files are usually exported from an Excel sheet, and a casual Excel user can easily hand you inconsistent or wrong date formats, which will cause trouble when you convert the values into $HOROLOG.
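For example, a quick way to catch bad dates during the load is to attempt the conversion and flag anything that fails. This is only a sketch with illustrative variable names:

Set dob = "2000-01-02"
Try {
    // $ZDATEH with dformat 3 expects ODBC format (YYYY-MM-DD) and returns a $HOROLOG date
    Set hDob = $ZDATEH(dob, 3)
} Catch {
    // Conversion failed: the value is not a valid YYYY-MM-DD date
    Write "Invalid date in file: ", dob, !
}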


Globals, these magic swords for storing data, have been around for a while, but not many people can use them efficiently or even know about this super-weapon at all.

If you use globals for tasks where they truly shine, the results may be amazing, either in terms of increased performance or dramatic simplification of the overall solution (1, 2).

Globals offer a special way of storing and processing data, which is completely different from SQL tables. They were first introduced in 1966 in the M(UMPS) programming language, which was initially used in medical databases. It is still used in the same way, but has also been adopted by some other industries where reliability and high performance are top priorities: finance, trading, etc.

Later M(UMPS) evolved into Caché ObjectScript (COS). COS was developed by InterSystems as a superset of M. The original language is still accepted by the developer community and alive in a few implementations. There are several signs of activity around the web: the MUMPS Google group, the Mumps User's Group, an effective ISO Standard, etc.

Modern global-based DBMSs support transactions, journaling, replication, and partitioning. This means they can be used for building modern, reliable and fast distributed systems.

Globals do not restrict you to the boundaries of the relational model. They give you the freedom of creating data structures optimized for particular tasks. For many applications reasonable use of globals can be a real silver bullet offering speeds that developers of conventional relational applications can only dream of.

Globals as a method of storing data can be used in many modern programming languages, both high- and low-level. Therefore, this article will focus specifically on globals and not the language they once came from.
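As a tiny taste of that freedom, here is a minimal sketch (the global name ^Orders and its keys are purely illustrative): hierarchical nodes are created simply by setting them, and traversed with $Order.

// Store orders under a two-level key: customer, then order number
Set ^Orders("Smith",1) = "book"
Set ^Orders("Smith",2) = "pen"
Set ^Orders("Jones",1) = "lamp"

// Traverse all orders for one customer
Set orderNo = ""
For {
    Set orderNo = $Order(^Orders("Smith",orderNo))
    Quit:orderNo=""
    Write "Smith order ",orderNo,": ",^Orders("Smith",orderNo),!
}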

Article
· Apr 9, 2019 3m read
IRIS/Ensemble as an ETL

IRIS and Ensemble are designed to act as an ESB/EAI. This means they are built to process lots of small messages.

But sometimes, in real life, we have to use them as an ETL. The downside is not that they can't do it, but that it can take a long time to process millions of rows at once.

To improve performance, I have created a new SQL outbound adapter that works only with JDBC.

BatchSqlOutboundAdapter

Extends EnsLib.SQL.OutboundAdapter to add batch and fetch support on a JDBC connection.
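To give a rough idea of the shape of such an extension, here is a sketch only; the class name and its ExecuteBatch method are hypothetical placeholders, not the actual API of the published adapter:

/// Hypothetical sketch of an outbound adapter that batches statements over JDBC
Class Community.BatchSqlOutboundAdapter Extends EnsLib.SQL.OutboundAdapter
{

/// Illustrative method: accept many parameter rows and send them as one JDBC batch
Method ExecuteBatch(pStatement As %String, ByRef pRows) As %Status
{
    // Real batching would rely on the JDBC driver's addBatch/executeBatch calls;
    // this placeholder only marks where that logic would live.
    Quit $$$OK
}

}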


In the last post we scheduled 24-hour collections of performance metrics using pButtons. In this post we are going to look at a few of the key metrics that are being collected and how they relate to the underlying system hardware. We will also start to explore the relationship between Caché (or any of the InterSystems Data Platforms) metrics and system metrics, and how you can use these metrics to understand the daily beat rate of your systems and diagnose performance problems.


This week I am going to look at CPU, one of the primary hardware food groups :) A customer asked me to advise on the following scenario: their production servers are approaching end of life and it's time for a hardware refresh. They are also thinking of consolidating servers by virtualising and want to right-size capacity, either bare-metal or virtualized. Today we will look at CPU; in later posts I will explain the approach for right-sizing the other key food groups - memory and IO.

So the questions are:


The other Technology Architects and I often have to explain Caché IO requirements to customers and vendors, and the way that Caché applications use storage systems. The following tables are useful when discussing a typical Caché IO profile and the requirements of a transactional database application with customers and vendors. The original tables were created by Mark Bolinsky.

In future posts I will be discussing more about storage IO, so I am also posting these tables now as a reference for those articles.


Hyper-Converged Infrastructure (HCI) solutions have been gaining traction for the last few years, with the number of deployments now increasing rapidly. IT decision makers are considering HCI when scoping new deployments or hardware refreshes, especially for applications already virtualised on VMware. Reasons for choosing HCI include: dealing with a single vendor, validated interoperability between all hardware and software components, high performance (especially IO), simple scalability by addition of hosts, simplified deployment, and simplified management.


One of the great availability and scaling features of Caché is Enterprise Cache Protocol (ECP). With consideration during application development, distributed processing using ECP allows a scale-out architecture for Caché applications. Application processing can scale to very high rates, from a single application server up to the processing power of 255 application servers, with no application changes.

Article
· Jun 19, 2025 10m read
Towards Smarter Table Statistics

This article describes a significant enhancement in the 2025.2 release of how InterSystems IRIS deals with table statistics, a crucial element for IRIS SQL processing. We'll start with a brief refresher on what table statistics are, how they are used, and why we needed this enhancement. Then, we'll dive into the details of the new infrastructure for collecting and saving table statistics, after which we'll zoom in on what the change means in practice for your applications. We'll end with a few additional notes on patterns enabled by the new model, and look forward to the follow-on phases of this initial delivery.


It seems like only yesterday that we did a small project in Java to test the performance of IRIS, PostgreSQL and MySQL (you can review the article we wrote back in June at the end of this article). If you remember, IRIS was superior to PostgreSQL and clearly superior to MySQL in insertions, with no big difference in queries.

Article
· Oct 22, 2025 2m read
Tips on handling Large data

Hello community,

I wanted to share my experience of working on large data projects. Over the years, I have had the opportunity to handle massive patient data, payor data, and transactional logs while working in the hospital industry. I have had the chance to build huge reports that had to be written with advanced logic, fetching data across multiple tables whose indexing did not help me write efficient code.

Here is what I have learned about managing large data efficiently.

Choosing the right data access method.

As we all here in the community are aware, IRIS provides multiple ways to access data. Choosing the right method depends on the requirement.

  • Direct Global Access: Fastest for bulk read/write operations. For example, if I have to traverse through indexes and fetch patient data, I can loop through the globals to process millions of records. This saves a lot of time.
Set ToDate=+$H
Set FromDate=+$H-1
For  Set FromDate=$Order(^PatientD("Date",FromDate)) Quit:(FromDate="")!(FromDate>ToDate)  Do
. Set PatId="" For  Set PatId=$Order(^PatientD("Date",FromDate,PatId)) Quit:PatId=""  Do
. . Write $Get(^PatientD("Date",FromDate,PatId)),!
  • Using SQL: Useful for reporting or analytical requirements, though slower for huge data sets (see the dynamic SQL sketch below).
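For instance, dynamic SQL with %SQL.Statement is convenient when the result feeds a report. The table and column names below are illustrative, not an actual schema:

Set stmt = ##class(%SQL.Statement).%New()
Set sc = stmt.%Prepare("SELECT PatID, DOB, Gender FROM MyApp.Patient WHERE AdmDate = ?")
If $SYSTEM.Status.IsOK(sc) {
    // Bind today's date in ODBC (YYYY-MM-DD) format
    Set rs = stmt.%Execute($ZDATE(+$HOROLOG,3))
    While rs.%Next() {
        Write rs.%Get("PatID")," ",rs.%Get("DOB"),!
    }
}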


It's been a long time since I last wrote an update post on IoP.


So what's new since the IoP command line interface was released?

Two new big features were added to IoP:
- Rebranding: the grongier.pex module was renamed to iop to reflect the new name of the project.
- Async support: IoP now supports async functions and coroutines.


InterSystems IRIS offers various ways to profile your code; in most cases this produces enough information to find the places where the most time is spent or where the most global sets happen. But sometimes it's difficult to understand the execution flow and how it ended up at that point.

To solve this, I've decided to implement a way to build a report that makes it possible to drill down through the call stack.


Our team is reworking an application to use REST services that use the same database as our current ZEN application. One of the new REST endpoints uses a query that ran very slowly when first implemented. After some analysis, we found that an index on one of the fields in the table greatly improved performance (a query that took 35 seconds was now taking a fraction of a second).
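For anyone adding such an index for the first time, it is a small change: declare the index in the persistent class and build it for the rows that already exist. This is only an illustration; the class, property, and index names below are made up:

/// Inside the (hypothetical) persistent class MyApp.Encounter, declare the index:
Index AdmDateIdx On AdmDate;

// Then build it for existing rows, e.g. from a terminal:
Do ##class(MyApp.Encounter).%BuildIndices($ListBuild("AdmDateIdx"))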


APM normally focuses on the activity of the application, but gathering information about system usage gives you important background that helps you understand and manage the performance of your application, so I am including the IRIS History Monitor in this series.

In this article I will briefly describe how you start the IRIS or Caché History Monitor to build a record of the system level activity to go with the application activity and performance information you gather. I will also give examples of SQL to access the information.


This post will guide you through the process of sizing shared memory requirements for database applications running on InterSystems data platforms. It will cover key aspects such as global and routine buffers, gmheap, and locksize, providing you with a comprehensive understanding. Additionally, it will offer performance tips for configuring servers and virtualizing IRIS applications. Please note that when I refer to IRIS, I include all the data platforms (Ensemble, HealthShare, iKnow, Caché, and IRIS).


In the previous parts (1, 2) we talked about globals as trees. In this article, we will look at them as sparse arrays.

A sparse array is a type of array in which most elements have the same value.

In practice, you will often see sparse arrays so huge that there is no point in occupying memory with identical elements. Therefore, it makes sense to organize sparse arrays in such a way that memory is not wasted on storing duplicate values.

In some programming languages, sparse arrays are part of the language - for example, in J, MATLAB. In other languages, there are special libraries that let you use them. For C++, those would be Eigen and the like.

Globals are good candidates for implementing sparse arrays for the following reasons:
