Robert Cemper · Jan 23 2m read
Global-Streams-to-SQL #2

Some technical background information

There is not just one class in this package: rcc.gstream.cls but also rcc.gstreamT.cls

While rcc.gstream works with direct access to the stream globals, the *T version uses
a Process Private Global (PPG) as Temporary storage.
using SELECT * FROM RCC.gstreamT WHERE RCC.useT('^jpgS')=1 and similar.

2 0
0 42
Robert Cemper · Jan 23 2m read

In general Global Streams are data objects embedded in Classes / Tables.
Using and viewing them with SQL is normally a part of the access to the containing tables.


During debugging or searching for strange or unexpected behavior there could be the need to
get closer to the stored stream. No big problem with direct access to Globals with SMP or Terminal.
But with SQL you are lost.
So my tool provides dynamic access to Global Streams wherever you may need this
Special thanks to @Oliver Wilms for the inspiration for this tool.

3 0
0 48

I just wrote up a quick sample to help a colleague load data into IRIS from R using RJDBC, and figured it's worth sharing here for future reference.

Ultimately it was pretty simple, aside from IRIS not liking "." in column names; the workaround is to just rename the columns. Someone better at R than me could probably provide some generic approach. smiley

4 2
0 136
Vicky Li · Nov 14, 2016 14m read
Mastering the JDBC SQL Gateway

As we all know, Caché is a great database that accomplishes lots of tasks within itself. However, what do you do when you need to access an external database? One way is to use the Caché SQL Gateway via JDBC. In this article, my goal is to answer the following questions to help you familiarize yourself with the technology and debug some common problems.

12 2
7 3,526

The last days I've work with the great new feature: LOAD DATA With this post I would like to share my first experiences with you. The following points do not contain any order or other evaluation. These are only things that I noticed when using the LOAD DATA command. It should also be noted that these points are based on the IRIS Version 2021.2.0.617 which is a preview release. So it may be that my observations do not apply to newer IRIS versions.

6 4
2 636

Astronomers’ tools

5 years ago, on December 19, 2013, the ESA launched an orbital telescope called Gaia. Learn more about the Gaia mission on the official website of the European Space Agency or in the article by Vitaly Egorov (Billion pixels for a billion stars).

However, few people know what technology the agency chose for storing and processing the data collected by Gaia. Two years before the launch, in 2011, the developers were considering a number of candidates (see “Astrostatistics and Data Mining” by Luis Manuel Sarro, Laurent Eyer, William O’Mullane, Joris De Ridder, pp. 111-112):

Comparing the technologies side-by-side produced the following results (source):

Technology Time
DB2 13min55s
PostgreSQL 8 14min50s
PostgreSQL 9 6min50s
Hadoop 3min37s
Cassandra 3min37s
Caché 2min25s

The first four will probably sound familiar even to schoolchildren. But what is Caché XEP?

10 9
1 882
Guillaume Rongier · Apr 9, 2019 3m read
IRIS/Ensemble as an ETL

IRIS and Ensemble are designed to act as an ESB/EAI. This mean they are build to process lots of small messages.

But some times, in real life we have to use them as ETL. The down side is not that they can't do so, but it can take a long time to process millions of row at once.

To improve performance, I have created a new SQLOutboundAdaptor who only works with JDBC.


Extend EnsLib.SQL.OutboundAdapter to add batch batch and fetch support on JDBC connection.

3 7
2 1,095

One of our apps uses a class query to support a ZEN Report and works just fine in that report, producing the expected results every time. We’ve since migrated to InterSystems Reports and noticed that, for a report using the same class query, 100s of extra rows with the same column values appear at its bottom.

2 0
0 221

DataGrip is a multi-engine database environment targeting the specific needs of professional SQL developers, DataGrip makes working with databases an enjoyable and productive experience.

To work with InterSystems IRIS from DataGrip you'll need to add InterSystems JDBC driver first (once per DataGrip) and after that add all your InterSystems IRIS connections.

Part 1: Add InterSystems IRIS JDBC Driver

1. Go To File → DataSources

1 0
0 438
Anssi Kauppi · Jun 30, 2020 3m read
Replicating Audit Log Near Real Time
Many organisations implement centralised log management systems to separate and centralise the log data in order to e.g. automate threat detection (and response) and to comply with regulatory requirements. The primary systems of interest are the various user facing applications, but increasingly also other kinds of systems including integration platforms.
1 1
1 257

The Caché System Management Portal includes a robust web-based SQL query tool, but for some applications it’s more convenient to use a dedicated SQL client installed on a user’s PC.

SQuirreL SQL is a well known open source SQL client built in Java, which uses JDBC to connect to a DBMS. As such, we can configure SQuirreL to connect to Caché using the Caché JDBC driver.

9 11
1 9,813

Apache Spark has rapidly become one of the most exciting technologies for big data analytics and machine learning. Spark is a general data processing engine created for use in clustered computing environments. Its heart is the Resilient Distributed Dataset (RDD) which represents a distributed, fault tolerant, collection of data that can be operated on in parallel across the nodes of a cluster. Spark is implemented using a combination of Java and Scala and so comes as a library that can run on any JVM.

11 5
0 2,379

I've asked a lot of questions leading up to this, so I wanted to share some of my progress.

The blue line represents the number of messages processed. The background color represents the average response time. You can see ticks for each hour (and bigger ticks for each day). Hovering over any point in the graph will show you the numbers for that period in time.

This is super useful for "at a glance" performance monitoring as well as establishing patterns in our utilization.

5 2
0 442