Embeddedpy-bridge: A Toolkit for Embedded Python

Overview

Embedded Python is a game-changer for InterSystems IRIS, offering access to the vast Python ecosystem directly within the database. However, bridging the gap between ObjectScript and Python can sometimes feel like translating between two different worlds.

2 1
2 86

YASPE is the successor to YAPE (Yet Another pButtons Extractor). YASPE has been written from the ground up with many internal changes to allow easier maintenance and enhancements.

YASPE functions:

  • Parse and chart InterSystems Caché pButtons and InterSystems IRIS SystemPerformance files for quick performance analysis of Operating System and IRIS metrics.
  • Allow a deeper dive by creating ad-hoc charts and by creating charts combining the Operating System and IRIS metrics with the "Pretty Performance" option.
  • The "System Overview" option saves you from searching your SystemPerformance files for system details or common configuration options.

YASPE is written in Python and is available on GitHub as source code or for Docker containers at:


13 1
5 877

High-Performance Message Searching in Health Connect

The Problem

Have you ever tried to do a search in Message Viewer on a busy interface and had the query time out? This can become quite a problem as the amount of data increases. For context, the instance of Health Connect I am working with does roughly 155 million Message Headers per day with 21 day message retention. To try and help with search performance, we extended the built-in SearchTable with commonly used fields in hopes that indexing these fields would result in faster query times. Despite this, we still couldn't get some of these queries to finish at all.

22 1
8 287
Article
· Jul 7, 2017 19m read
Indexing of non-atomic attributes

Quotes (1NF/2NF/3NF)ru:

Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
The same value can be atomic or non-atomic depending on the purpose of this value. For example, “4286” can be
  • atomic, if its denotes “a credit card’s PIN code” (if it’s broken down or reshuffled, it is of no use any longer)
  • non-atomic, if it’s just a “sequence of numbers” (the value still makes sense if broken down into several parts or reshuffled)

This article explores the standard methods of increasing the performance of SQL queries involving the following types of fields: string, date, simple list (in the $LB format), "list of <...>" and "array of <...>".

7 0
0 1.2K
Article
· Jul 26, 2017 3m read
What is APM?

What is APM?

I am talking about Application Performance Management at global summit, and several people have asked what that means so it is time for a bit of an explanation.

APM or Application Performance Management (sometimes referred to as Application Performance Monitoring) has a very good (if complicated) explanation on Wikipedia but to me it just means looking at performance from the users’ point of view and the level of service provided to them.

3 0
1 794

Introduction

Database performance has become a critical success factor in a modern application environment. Therefore identifying and optimizing the most resource-intensive SQL queries is essential for guaranteeing a smooth user experience and maintaining application stability.

This article will explore a quick approach to analyzing SQL query execution statistics on an InterSystems IRIS instance to identify areas for optimization within a macro-application.

Rather than focusing on real-time monitoring, we will set up a system that collects and analyzes statistics pre-calculated by IRIS once an hour. This approach, while not enabling instantaneous monitoring, offers an excellent compromise between the wealth of data available and the simplicity of implementation.

We will use Grafana for data visualization and analysis, InfluxDB for time series storage, and Telegraf for metrics collection. These tools, recognized for their power and flexibility, will allow us to obtain a clear and exploitable view.

More specifically, we will detail the configuration of Telegraf to retrieve statistics. We will also set up the integration with InfluxDB for data storage and analysis, and create customized dashboards in Grafana. This will help us quickly identify queries requiring special attention.

To facilitate the orchestration and deployment of these various components, we will employ Docker.

logos.png

6 0
4 365
Article
· Sep 30, 2016 1m read
ECP Magic

I saw someone recently refer to ECP as magic. It certainly seems so, and there is a lot of very clever engineering to make it work. But the following sequence of diagrams is a simple view of how data is retrieved and used across a distributed architecture.

For more more on ECP including capacity planning follow this link: Data Platforms and Performance - Part 7 ECP for performance, scalability and availability

10 0
0 1.3K

Most transactional applications have a 70:30 RW profile. However, some special cases have extremely high write IO profiles.

I ran storage IO tests in the ap-southeast-2 (Sydney) AWS region to simulate IRIS database IO patterns and throughput similar to a very high write rate application.

The test aimed to determine whether the EC2 instance types and EBS volume types available in the AWS Australian regions will support the high IO rates and throughput required.

5 0
0 1.7K

It has been noticed that some customers running JAVA programs (for example, FOP) on AIX would see the server eventually running low then out of memory. Customer would notice the system pages heavily and user experience becomes bad. And the server would crash when out of memory.

When the problem happens, we can see in ipcs a lot of shared memory segment marked for deletion (Capital D at the beginning of MODE section). This means they will not disappear until the last process attached to the segment detaches it.

5 0
2 1.9K

Windows Subsystem for Linux (WSL) is a feature of Windows that allows you to run a Linux environment on your Windows machine, without the need for a separate virtual machine or dual booting.

WSL is designed to provide a seamless and productive experience for developers who want to use both Windows and Linux at the same time**.

2 0
1 473

What is %SQLRESTRICT

%SQLRESTRICT is a special %FILTER clause for use in MDX queries in InterSystems IRIS Business Intelligence. Since this function begins with %, it means this is a special MDX extension created by InterSystems. It allows users to insert an SQL statement that will be used to restrict the returned records in the MDX Result Set. This SQL statement must return a set of Source Record IDs to limit the results by. Please see the documentation for more information.

Why is this useful?

This is useful because there are often times users want to restrict the results in their MDX Result Set based on information that is not in their cubes. It may be the case that this information may not make sense to be in the cube. Other times this can be useful when there is a large set of values you want to restrict. As mentioned before, this is not a standard MDX function, it was created by InterSystems to handle cases were queries were not performing well or cases that were not easily solved by existing functions.

6 0
2 729

Index

This is a list of all the posts in the Data Platforms’ capacity planning and performance series in order. Also a general list of my other posts. I will update as new posts in the series are added.


You will notice that I wrote some posts before IRIS was released and refer to Caché. I will revisit the posts over time, but in the meantime, Generally, the advice for configuration is the same for Caché and IRIS. Some command names may have changed; the most obvious example is that anywhere you see the ^pButtons command, you can replace it with ^SystemPerformance.


While some posts are updated to preserve links, others will be marked as strikethrough to indicate that the post is legacy. Generally, I will say, "See: some other post" if it is appropriate.


Capacity Planning and Performance Series

Generally, posts build on previous ones, but you can also just dive into subjects that look interesting.


16 0
7 6.6K

Often times support and sales engineers are asked about recent benchmark results on various platforms and large scale configurations. These will be made available here in the Developer Community in the "Documentation" section, and as an example here's a link to a recent Intel E7 v2 series processor benchmark.

https://community.intersystems.com/documentation/data-scalability-intersystems-caché-and-intel-processors-0

0 0
0 355

Often InterSystems technology architect team is asked about recommended storage arrays or storage technologies. To provide this information to a wider audience as reference, a new series is started to provide some of the results we have encountered with various storage technologies. As a general recommendation, all-flash storage is highly recommended with all InterSystems products to provide the lowest latency and predictable IOPS capabilities.

The first in the series was the most recently tested Netapp AFF A300 storage array. This is middle-tier type storage array with several higher models above it. This specific A300 model is capable of supporting a minimal configuration of only a few drives to hundreds of drives per HA pair, and also capable of being clustered with multiple controller pairs for tens of PB's of disk capacity and hundreds of thousands of IOPS or higher.

3 0
0 3.5K

Introduction

In the first article in this series, we’ll take a look at the entity–attribute–value (EAV) model in relational databases to see how it’s used and what it’s good for. Then we'll compare the EAV model concepts to globals.

3 0
4 4.4K
Article
· Sep 9, 2024 14m read
eBPF: Tracing Kernel Events for IRIS Workloads

I attended Cloud Native Security Con in Seattle with full intention of crushing OTEL day, then perusing the subject of security applied to Cloud Native workloads the following days leading up to CTF as a professional excercise. This was happily upended by a new understanding of eBPF, which got my screens, career, workloads, and atitude a much needed upgrade with new approaches to solving workload problems.

So I made it to the eBPF party and have been attending clinic after clinic on the subject ever since, here I would like to "unbox" eBPF as a technical solution, mapped directly to what we do in practice (even if its a bit off), and step through eBPF through my experimentation on supporting InterSystems IRIS Workloads, particularly on Kubernetes, but not necessarily void on standalone workloads.

eBee Steps with eBPF and InterSystems IRIS Workloads

6 0
3 365

A More Industrial-Looking Global Storage Scheme

In the first article in this series, we looked at the entity–attribute–value (EAV) model in relational databases, and took a look at the pros and cons of storing those entities, attributes and values in tables. We learned that, despite the benefits of this approach in terms of flexibility, there are some real disadvantages, in particular a basic mismatch between the logical structure of the data and its physical storage, which causes various difficulties.

2 0
0 970

So if you are following from the previous post or dropping in now, let's segway to the world of eBPF applications and take a look at Parca, which builds on our brief investigation of performance bottlenecks using eBPF, but puts a killer app on top of your cluster to monitor all your iris workloads, continually, cluster wide!

Continous Profiling with Parca, IRIS Workloads Cluster Wide

3 0
2 329

It's almost time to get your customers upgraded to new versions - are you worried about showing off your SQL Performance after upgrades? If you want to upgrade without worrying, then I have just the program for you!!! Check out this video from Global Summit 2016 featuring yours truly explaining how to upgrade a system without worrying about pesky SQL queries showing on your waistline!

https://www.youtube.com/watch?v=GfFPYfIoR_g

1 0
0 385

The rise of Big Data projects, real-time self-service analytics, online query services, and social networks, among others, have enabled scenarios for massive and high-performance data queries. In response to this challenge, MPP (massively parallel processing database) technology was created, and it quickly established itself. Among the open-source MPP options, Presto (https://prestodb.io/) is the best-known option. It originated in Facebook and was utilized for data analytics, but later became open-sourced.

6 0
3 347

Looking at my database I see I have a very big ^rINDEXSQL global? Why is that? 😬

In the Management Portal SQL page, under "SQL Statements" I see a 'Clean stale' button - what does this do? 🤔

In the list of Statements some have a 'Location' value and some don't? How is that? 🤨

5 0
1 73