Article
· Aug 2, 2022 8m read
Data models in InterSystems IRIS

Before discussing the different data models that exist, we should first establish what a database is and how it is used.

A database is an organized collection of data stored and accessed electronically. It is used to store and retrieve structured, semi-structured, or raw data which is often related to a theme or activity.

At the heart of every database lies at least one model used to describe its data. And depending on the model it is based on, a database may have slightly different characteristics and store different types of data.

To write, retrieve, modify, sort, transform, or print the information in a database, software called a database management system (DBMS) is used.

The size, capacity, and performance of databases and their respective DBMSs have increased by several orders of magnitude. This growth has been made possible by technological advances in various areas, such as processors, computer memory, computer storage, and computer networks. In general, the development of database technology can be divided into four generations based on the data model or structure: navigational, relational, object, and post-relational.


InterSystems IRIS currently limits classes to 999 properties.

But what to do if you need to store more data per object?

This article answers this question (with a cameo by Community Python Gateway, showing how you can transfer wide datasets into Python).

The answer is actually very simple: InterSystems IRIS currently limits classes to 999 properties, but not to 999 primitives. A property in InterSystems IRIS can itself be an object with up to 999 properties, and so on, so the limit is easy to work around.
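The nesting idea can be illustrated outside IRIS with a minimal Python sketch. The `Patient`, `Address`, and `Vitals` classes and both helper functions are hypothetical names invented for this illustration, and Python dataclasses only stand in for IRIS classes; nothing here is IRIS API.

```python
from dataclasses import dataclass, field, fields, is_dataclass

PROPERTY_LIMIT = 999  # per-class property limit quoted in the text


@dataclass
class Address:  # hypothetical nested object
    street: str = ""
    city: str = ""


@dataclass
class Vitals:  # hypothetical nested object
    height_cm: float = 0.0
    weight_kg: float = 0.0
    pulse: int = 0


@dataclass
class Patient:  # only 3 direct properties, two of which are objects
    name: str = ""
    address: Address = field(default_factory=Address)
    vitals: Vitals = field(default_factory=Vitals)


def count_primitives(cls) -> int:
    """Count primitive fields reachable from cls, descending into nested dataclasses."""
    total = 0
    for f in fields(cls):
        total += count_primitives(f.type) if is_dataclass(f.type) else 1
    return total


def max_primitives(depth: int, limit: int = PROPERTY_LIMIT) -> int:
    """Upper bound on primitives when every property nests another full class."""
    return limit ** depth


print(len(fields(Patient)), count_primitives(Patient))  # direct properties vs. primitives
print(max_primitives(2))  # two levels of nesting
```

With two levels of nesting the bound is already 999 × 999 = 998,001 primitives, which is why the per-class limit rarely matters in practice.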


This is the second piece in our series on 2021.2 SQL enhancements delivering an adaptive, high-performance SQL experience. In this article, we'll zoom in on the innovations in gathering Table Statistics, which are of course the primary input for the Run Time Plan Choice capability we described in the previous article.


Why I've decided to write this

Once again I faced a challenge that cost me some time and a lot of testing to reach the best solution. Now that I've managed to solve it, I'd like to share a little of what I learned.

What happened?

In a namespace there were a lot of similar classes, so to simplify them there was a superclass with common properties. There were also relationships between them. I had to export one of them to JSON, but I couldn't change the superclasses, or I would break the flow of many other integrations.


The 2021.2 release of the InterSystems IRIS Data Platform includes many exciting new features for fast, flexible and secure development of your mission-critical applications. Embedded Python definitely takes the limelight (and for good reason!), but in SQL we've also made a massive step forward towards a more adaptive engine that gathers detailed statistical information about your table data and exploits it to deliver the best query plans. In this brief series of articles, we'll take a closer look at three elements that are new in 2021.2 and work together towards this goal, starting with Run Time Plan Choice.

It's hard to figure out the right order to talk about these (you can't imagine how often I've reshuffled them in writing this article!) because they fit together in such a nice way. As such, feel free to go out on a limb and read these in random order :)


This is the third article in our short series on innovations in IRIS SQL that deliver a more adaptive, high-performance experience for analysts and applications querying relational data on IRIS. It may be the last article in this series for 2021.2, but we have several more enhancements lined up in this area. In this article, we'll dig a little deeper into the additional table statistics we're starting to gather in this release: histograms.


A More Industrial-Looking Global Storage Scheme

In the first article in this series, we looked at the entity–attribute–value (EAV) model in relational databases, and took a look at the pros and cons of storing those entities, attributes and values in tables. We learned that, despite the benefits of this approach in terms of flexibility, there are some real disadvantages, in particular a basic mismatch between the logical structure of the data and its physical storage, which causes various difficulties.


Introduction

In the first article in this series, we’ll take a look at the entity–attribute–value (EAV) model in relational databases to see how it’s used and what it’s good for. Then we'll compare the EAV model concepts to globals.
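As a rough illustration of what an EAV layout looks like in a relational database, here is a minimal sketch using Python's built-in `sqlite3` module. The single three-column table, its name `eav`, the sample rows, and the `load_entity` helper are all assumptions made for this example; the article itself may use a different schema.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# One generic table holds every attribute of every entity (assumed layout).
conn.execute(
    """
    CREATE TABLE eav (
        entity    TEXT NOT NULL,
        attribute TEXT NOT NULL,
        value     TEXT,
        PRIMARY KEY (entity, attribute)
    )
    """
)

# Entities may carry completely different attribute sets: no schema change needed.
rows = [
    ("product:1", "name", "Laptop"),
    ("product:1", "ram_gb", "16"),
    ("product:2", "name", "Desk"),
    ("product:2", "material", "oak"),
]
conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", rows)


def load_entity(conn: sqlite3.Connection, entity: str) -> dict:
    """Reassemble one logical record from its scattered attribute rows."""
    cur = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ? ORDER BY attribute",
        (entity,),
    )
    return dict(cur.fetchall())


print(load_entity(conn, "product:1"))
```

Note that every value is stored as text and a logical record has to be reassembled from several rows, which hints at the kind of trade-off the EAV model brings.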


In the previous parts (1, 2) we talked about globals as trees. In this article, we will look at them as sparse arrays.

A sparse array is a type of array in which most elements have the same value.

In practice, you will often see sparse arrays so huge that there is no point in occupying memory with identical elements. Therefore, it makes sense to organize sparse arrays in such a way that memory is not wasted on storing duplicate values.
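The idea of not storing duplicate values can be sketched in a few lines of Python. This is only an illustration built on a plain dictionary; the `SparseArray` class and its method names are hypothetical and are not how globals are implemented.

```python
class SparseArray:
    """Fixed-length array that stores only elements differing from the default."""

    def __init__(self, length: int, default=0):
        self.length = length
        self.default = default
        self._cells = {}  # index -> non-default value

    def _check(self, index: int) -> None:
        if not 0 <= index < self.length:
            raise IndexError(index)

    def __getitem__(self, index: int):
        self._check(index)
        # Missing cells implicitly hold the default value.
        return self._cells.get(index, self.default)

    def __setitem__(self, index: int, value) -> None:
        self._check(index)
        if value == self.default:
            # Writing the default frees the cell instead of storing a duplicate.
            self._cells.pop(index, None)
        else:
            self._cells[index] = value

    def stored(self) -> int:
        """Number of cells that actually occupy memory."""
        return len(self._cells)


arr = SparseArray(10**9)   # a billion logical elements
arr[5] = 42
arr[999_999_999] = 7
arr[5] = 0                 # back to the default, so the cell is dropped
print(arr[123], arr[999_999_999], arr.stored())
```

The logical length is a billion elements, yet memory use is proportional only to the number of non-default cells.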

In some programming languages, sparse arrays are part of the language, for example in J and MATLAB. In other languages, there are special libraries that let you use them. For C++, those would be Eigen and the like.

Globals are good candidates for implementing sparse arrays for the following reasons:


Globals, these magic swords for storing data, have been around for a while, but not many people can use them efficiently or even know that this super-weapon exists.

If you use globals for tasks where they truly shine, the results may be amazing, either in terms of increased performance or dramatic simplification of the overall solution (1, 2).

Globals offer a special way of storing and processing data, which is completely different from SQL tables. They were first introduced in 1966 in the M(UMPS) programming language, which was initially used in medical databases. It is still used in the same way, but has also been adopted by some other industries where reliability and high performance are top priorities: finance, trading, etc.

Later M(UMPS) evolved into Caché ObjectScript (COS). COS was developed by InterSystems as a superset of M. The original language is still accepted by the developer community and lives on in a few implementations. There are several signs of activity around the web: the MUMPS Google group, the MUMPS User's Group, an ISO standard that is still in effect, etc.

Modern global-based DBMSs support transactions, journaling, replication, and partitioning. This means that they can be used for building modern, reliable, and fast distributed systems.

Globals do not restrict you to the boundaries of the relational model. They give you the freedom to create data structures optimized for particular tasks. For many applications, judicious use of globals can be a real silver bullet, offering speeds that developers of conventional relational applications can only dream of.

Globals as a method of storing data can be used in many modern programming languages, both high- and low-level. Therefore, this article will focus specifically on globals and not the language they once came from.
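To give a feel for the data structure without any particular language binding, here is a minimal in-memory sketch in Python that models a global as values addressed by a path of subscripts. The `GlobalSketch` class and its methods are hypothetical names for this illustration only; it is not a real globals API, it is not persistent, and its plain string sorting does not reproduce the ordering rules of an actual global-based DBMS.

```python
class GlobalSketch:
    """In-memory stand-in for one global: values addressed by a tuple of subscripts."""

    def __init__(self):
        self._nodes = {}  # tuple of subscripts -> value

    def set(self, subscripts, value) -> None:
        """Store a value at the node identified by the subscript path."""
        self._nodes[tuple(subscripts)] = value

    def get(self, subscripts, default=None):
        """Read a node; nodes that were never set simply do not exist."""
        return self._nodes.get(tuple(subscripts), default)

    def children(self, subscripts=()):
        """Return the sorted next-level subscripts below the given node."""
        prefix = tuple(subscripts)
        n = len(prefix)
        return sorted(
            {key[n] for key in self._nodes if len(key) > n and key[:n] == prefix}
        )


g = GlobalSketch()
g.set(("person", "1", "name"), "Ann")
g.set(("person", "1", "phone", "home"), "555-0100")
g.set(("person", "2", "name"), "Bob")

print(g.children(("person",)))         # subscripts directly under "person"
print(g.children(("person", "1")))     # each node can have its own shape
print(g.get(("person", "2", "phone"))) # absent nodes cost nothing
```

Each record can have a different shape and depth, which is the kind of freedom from a fixed relational schema described above.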
