Modern data architectures utilize real-time data capture, transformation, movement, and loading solutions to build data lakes, analytical warehouses, and big data repositories. It enables the analysis of data from various sources without impacting the operations that use them. To achieve this, establishing a continuous, scalable, elastic, and robust data flow is essential. The most prevalent method for that is through the CDC (Change Data Capture) technique. CDC monitors for small data set production, automatically captures this data, and delivers it to one or more recipients, including analytical data repositories. The major benefit is the elimination of the D+1 delay in analysis, as data is detected at the source as soon as it is produced, and later is replicated to the destination.

This article will demonstrate the two most common data sources for CDC scenarios, both as a source and a destination. For the data source (origin), we will explore the CDC in SQL databases and CSV files. For the data destination, we will use a columnar database (a typical high-performance analytical database scenario) and a Kafka topic (a standard approach for streaming data to the cloud and/or to multiple real-time data consumers).

Overview

This article will provide a sample for the following interoperability scenario:

9 0
4 110

Window functions in InterSystems IRIS let you perform powerful analytics — like running totals, rankings, and moving averages — directly in SQL.
They operate over a "window" of rows related to the current row, without collapsing results like GROUP BY.
This means you can write cleaner, faster, and more maintainable queries — no loops, no joins, no temp tables.

In this article let's understand the mechanics of window functions by addressing some common data analisys tasks.

6 0
2 168
3 2
0 88

Prefer not to read? Check out the demo video I created:

https://www.youtube.com/embed/-OwOAHC5b3s
[This is an embedded link, but you cannot view embedded content directly on the site because you have declined the cookies necessary to access it. To view embedded content, you would need to accept all cookies in your Cookies Settings]

2 3
1 188

The rise of Big Data projects, real-time self-service analytics, online query services, and social networks, among others, have enabled scenarios for massive and high-performance data queries. In response to this challenge, MPP (massively parallel processing database) technology was created, and it quickly established itself. Among the open-source MPP options, Presto (https://prestodb.io/) is the best-known option. It originated in Facebook and was utilized for data analytics, but later became open-sourced.

6 0
3 312

In today's data landscape, businesses encounter a number of different challenges. One of them is to do analytics on top of unified and harmonized data layer available to all the consumers. A layer that can deliver the same answers to the same questions irrelative to the dialect or tool being used.

11 2
1 507

The invention and popularization of Large Language Models (such as OpenAI's GPT-4) has launched a wave of innovative solutions that can leverage large volumes of unstructured data that was impractical or even impossible to process manually until recently.

27 4
6 811
Article
· Nov 27, 2023 4m read
What about DMN?

A few months ago, I faced a significant challenge: streamlining the handling of business logic in our application. My goal was to extract the business logic from the code and hand it over to analysts. Dealing with a multitude of rules could easily result in a code littered with countless "if" statements, especially if the coder lacked an understanding of cyclomatic complexity. Such code becomes a source of pain for those working with it—difficult to write, test, and develop.

3 1
0 367

Creating information dashboards, pivot tables, and widgets is an important step in analysis that provides valuable sources of information for informed decision-making. The IRIS BI platform offers many opportunities to create and customize these elements. In this article, we will take a closer look at the basic techniques for developing them and the importance of using them.

7 1
0 557

When analyzing data, there is often a need to look at specific indicators more thoroughly and to highlight sections of information of particular interest to a user.

For instance, examining the data dynamics for specific regions or dates can help us uncover some hidden trends and patterns that will allow us to make an informed decision about our project in the future.

2 0
1 453

As said in the previous article about the iris-fhir-generative-ai experiment, the project logs all events for analysis. Here we are going to discuss two types of analysis covered by analytics embedded in the project:

2 0
1 309
Article
· Jul 4, 2023 2m read
IntegratedMLandDashboardSample

A simple data analysis example created in IntegratedML and Dashboard

Based on InterSystems' Integrated ML technology and Dashboard, automatically generate relevant predictions and BI pages based on uploaded CSV files. The front and back ends are completed in Vue and Iris, allowing users to generate their desired data prediction and analysis pages with simple operations and make decisions based on them.

# ZPM installation

zpm:USER>install IntegratedMLandDashboardSample

# Process Deployment

3 2
0 259
Article
· Jun 13, 2023 2m read
OEX mapping #2

Technology Strategy

When I started this project I had set myself limits:
Though there is a wide range of almost ready-to-use modules in various languages
and though IRIS has excellent facilities and interfaces to make use of them
I decided to solve the challenge "totally internal" just with embedded Python, SQL, ObjectScript
Neither Java, nor Nodes, nor Angular, PEX, ... you name it.
The combination of embedded Python and SQL is preferred. ObjectScript is just my last chance.

2 0
0 225
Article
· Apr 19, 2023 2m read
Apache Superset now with IRIS

Apache Superset is a modern data exploration and data visualization platform. Superset can replace or augment proprietary business intelligence tools for many teams. Superset integrates well with a variety of data sources.

And now it is possible to use with InterSystems IRIS as well.

An online demo is available and it uses IRIS Cloud SQL as a data source.

7 4
0 1K

With the improvement of living standards, people pay more and more attention to physical health. And the healthy development of children has become more and more a topic of concern for parents. The child's physical development can be reflected from the child's height and weight. Therefore, it is of great significance to predict the height and weight in a timely manner.

1 0
1 236
Article
· Feb 13, 2023 4m read
When to use Columnar Storage

With InterSystems IRIS 2022.2, we introduced Columnar Storage as a new option for persisting your IRIS SQL tables that can boost your analytical queries by an order of magnitude. The capability is marked as experimental in 2022.2 and 2022.3, but will "graduate" to a fully supported production capability in the upcoming 2023.1 release.

The product documentation and this introductory video, already describe the differences between row storage, still the default on IRIS and used throughout our customer base, and columnar table storage and provide high-level guidance on choosing the appropriate storage layout for your use case. In this article, we'll elaborate on this subject and share some recommendations based on industry-practice modelling principles, internal testing, and feedback from Early Access Program participants.

15 2
3 737

IRIS BI

We offer you to embed business intelligence into your applications in order to give your users an opportunity to ask and answer sophisticated questions about their data. Typically, your application will include customizable dashboards that can provide insight into data from Business Intelligence models known as cubes.

In contrast with traditional BI systems that use static data warehouses, Business Intelligence keeps being constantly synchronized with the live transactional data.

5 2
1 598
Article
· Nov 15, 2022 6m read
How to develop in Logi

Today we will talk about InterSystems Reports. This is a BI system that provides you with tools to create static reports and export them to different file formats. We will see how it works using the DC Analytics public analytical sample as an example. In this article, we will examine how to familiarize yourself with the reports available in the repository, how to make a new report based on a ready-made data structure, and how to prepare a data structure from scratch.

3 0
0 811

Today we will talk about Adaptive Analytics. This is a system that allows you to receive data from various sources with a relativistic data structure and create OLAP cubes based on this data. This system also provides the ability to filter and aggregate data and has mechanisms to speed up the work of analytical queries.

6 0
1 763
Article
· Oct 24, 2022 10m read
PerfTools IO Test Suite

Purpose

This set of tools (RanRead, RanWrite, and the combined RanIO) is used to generate random read and write events within a database (or pair of databases) to test the IO speed of IRIS running on a specified hardware setup. While Read operations can be measured in the usual Input/Output operations per second (IOPS) since they're direct disk reads, write events are sent to the database and thus their physical writes are managed by IRIS's write daemon.

Results gathered from the IO tests will vary from configuration to configuration based on the IO sub-system. Before running these tests, ensure corresponding operating system and storage level monitoring are configured to capture IO performance metrics for later analysis. The suggested method is by running the System Performance tool that comes bundled within IRIS. Please note that this is an update to a previous release, which can be found here.

5 1
1 936