Article
Evgeniy Potapov · Mar 18, 2024
Pandas is not just a popular software library. It is a cornerstone in the Python data analysis landscape. Renowned for its simplicity and power, it offers a variety of data structures and functions that are instrumental in transforming the complexity of data preparation and analysis into a more manageable form. It is particularly relevant in such specialized environments as ObjectScript for Key Performance Indicators (KPIs) and reporting, especially within the framework of the InterSystems IRIS platform, a leading data management and analysis solution. In the realm of data handling and analysis, Pandas stands out for several reasons. In this article, we aim to explore those aspects in depth:
Key Benefits of Pandas for Data Analysis:
In this part, we will delve into the various advantages of using Pandas. These include intuitive syntax, efficient handling of large datasets, and the ability to work seamlessly with different data formats. The ease with which Pandas integrates into existing data analysis workflows is also a significant factor that enhances productivity and efficiency.
Solutions to Typical Data Analysis Tasks with Pandas:
Pandas is versatile enough to tackle routine data analysis tasks, ranging from simple data aggregation to complex transformations. We will explore how Pandas can be used to solve these typical challenges, demonstrating its capabilities in data cleaning, transformation, and exploratory data analysis. This section will provide practical insights into how Pandas simplifies these tasks.
Using Pandas Directly in ObjectScript KPIs in IRIS:
The integration of Pandas with ObjectScript for the development of KPIs in the IRIS platform is simply a game-changer. This part will cover how Pandas can be utilized directly within ObjectScript, enhancing the KPI development process. We will also explore practical examples of how Pandas can be employed to analyze and visualize data, thereby contributing to more robust and insightful KPIs.
Recommendations for Implementing Pandas in IRIS Analytic Processes:
Implementing a new tool in an existing analytics process can be challenging. For that reason, this section aims to provide best practices and recommendations for integrating Pandas into the IRIS analytics ecosystem as smoothly as possible. From setup and configuration to optimization and best practices, we will cover essential guidelines to ensure a successful integration of Pandas into your data analysis workflow.
Pandas is a powerful data analytics library in the Python programming language. Below, you can find a few benefits of Pandas for data analytics:
Ease of use: Pandas provides a simple and intuitive interface for working with data. It is built on top of the NumPy library and provides such high-level data structures as DataFrames, which makes it easy to work with tabular data.
Data Structures: The principal data structures in Pandas are Series and DataFrame. Series is a one-dimensional array with labels, whereas DataFrame is a two-dimensional table representing a set of Series. These data structures combined allow convenient storage and manipulation of data.
Handling missing data: Pandas provides convenient methods for detecting and handling missing data (NaN or None). It includes some methods for deleting, filling, or replacing missing values, simplifying your work with real data.
Data grouping and aggregation: With Pandas, it is easy to group data by features and apply aggregation functions (sum, mean, median, etc.) to each data group (see the short sketch after this list).
Powerful indexing capabilities: Pandas provides flexible tools for indexing data. You can use labels, numeric indexes, or multiple levels of indexing. It allows you to filter, select, and manipulate data efficiently.
Reading and writing data: Pandas supports multiple data formats, including CSV, Excel, SQL, JSON, HTML, etc. It facilitates the process of reading and writing data from/to various sources.
Extensive visualization capabilities: Pandas is integrated with such visualization libraries as Matplotlib and Seaborn, making it simple to create graphs and visualize data, especially with the help of DeepSeeWeb through integration via embedded Python.
Efficient time-series handling: Pandas provides multiple features for working with time series, including powerful tools for handling timestamps and periods.
Extensive data manipulation capabilities: The library provides various functions for filtering, sorting, and reshaping data, as well as joining and merging tables, which makes it a powerful tool for data manipulation.
Excellent performance: Pandas is purposefully optimized to handle large amounts of data. It provides high performance by using Cython and enhanced data structures.
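To make a few of these benefits concrete, here is a minimal, self-contained sketch (the sample data below is invented purely for illustration) showing missing-data handling, label-based filtering with .loc, and grouping with aggregation:
```python
import pandas as pd

# A small DataFrame with one missing value (invented sample data)
df = pd.DataFrame({
    "member": ["alice", "bob", "carol", "dave"],
    "posts": [10.0, None, 7.0, 3.0],
    "year": [2023, 2023, 2024, 2024],
})

# Handle missing data: replace NaN with 0
df["posts"] = df["posts"].fillna(0)

# Label-based filtering with .loc
active = df.loc[df["posts"] > 5]

# Group by a feature and apply an aggregation function to each group
posts_per_year = df.groupby("year")["posts"].sum()

print(active)
print(posts_per_year)
```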
Let's look at an example of Pandas' implementation in an ObjectScript environment. We will employ VSCode as our development environment. The choice of IDE in this case was determined by the availability of the InterSystems ObjectScript Extension Pack, which provides a debugger and editor for ObjectScript. First of all, let's create a KPI class:
Class BI.KPI.pandasKpi Extends %DeepSee.KPI
{
}
Then, we should make an XML document defining the type, name, and number of columns and filters of our KPI:
XData KPI [ XMLNamespace = "http://www.intersystems.com/deepsee/kpi" ]
{
<!-- 'manual' KPI type will tell DeepSee that data will be gathered from the class method defined by us-->
<kpi name="MembersPandasDemo" sourceType="manual">
<!-- we are going to need only one column for our KPI query -->
<property columnNo="1" name="Members" displayName="Community Members"/>
<!-- and lastly we should define a filter for our members -->
<filter name="InterSystemsMember"
displayName="InterSystems member"
sql="SELECT DISTINCT ISCMember from Community.Member"/>
</kpi>
}
The next step is to define the Python function, write the imports, and create the necessary variables:
ClassMethod MembersDF(sqlstring) As %Library.DynamicArray [ Language = python ]
{
# First of all, we import the most important library in our script: IRIS.
# IRIS library provides syntax for calling ObjectScript classes.
# It simplifies Python-ObjectScript integration.
# With the help of the library we can call any class and class method, and
# it returns whatever data type we like, and ObjectScript understands it.
import iris
# Then, of course, import the pandas itself.
import pandas as pd
# Create three empty lists:
Id_list = []
time_list = []
ics_member = []
Next step: define a query against the database:
# Define SQL query for fetching data.
# The query can be as simple as possible.
# All the work will be done by pandas:
query = """
SELECT
id as ID, CAST(TO_CHAR(Created, 'YYYYMM') as Int) as MonthYear, ISCMember as ISC
FROM Community.Member
order by Created DESC
"""
Then, we need to save the resulting data into an array group:
# Call the class specified for executing SQL statements.
# We use embedded Python library to call the class:
sql_class = iris.sql.prepare(query)
# We use it again to call dedicated class methods:
rs = sql_class.execute()
# Then we use pandas directly on the result set to make dataFrame:
data = rs.dataframe()
We can also pass an argument to filter our data frame.
# Filter example
# We take an argument sqlstring which, in this case, contains boolean data.
# With the handy .loc function, we filter all the data:
if sqlstring is not False:
data = data.loc[data["ISC"] == int(sqlstring)]
Now, we should group the data and define x-axis for it:
# Group data by date displayed like MonthYear:
grouped_data = data.groupby(["MonthYear"]).count()
Unfortunately, we cannot take the date column directly from the grouped DataFrame, so, instead, we take the date column from the original DataFrame and process it.
# Filter out duplicate dates and append them to a list.
# After grouping by MonthYear, pandas automatically filters off duplicate dates.
# We should do the same to match our arrays:
sorted_filtered_dates = [item for item in set(data["MonthYear"])]
# Sort the dates in descending order:
date = sorted(sorted_filtered_dates, reverse=True)
# Convert the grouped column to a list:
id = grouped_data["ID"].tolist()
# Reverse values according to the date array:
id.reverse()
# In order to return the appropriate object to ObjectScript so that it understands it,
# we call '%Library.DynamicArray' (it is the closest one to python and an easy-to-use type of array).
# Again, we use IRIS library inside python code:
OBJIDList = iris.cls('%Library.DynamicArray')._New()
OBJtimeList = iris.cls('%Library.DynamicArray')._New()
# Append all data using the DynamicArray method _Push():
for i in date:
OBJtimeList._Push(i)
for i in id:
OBJIDList._Push(i)
return OBJIDList, OBJtimeList
}
The next step is to define the KPI-specific method that tells DeepSee which data to take:
// Define method. The method must always be %OnLoadKPI(). Otherwise, the system will not recognise it.
Method %OnLoadKPI() As %Status
{
//Define string for the filter. Set the default to zero
set sqlstring = 0
//Call %filterValues method to fetch any filter data from the widget.
if $IsObject(..%filterValues) {
if (..%filterValues.InterSystemsMember'="")
{
set sqlstring=..%filterValues.%data("InterSystemsMember")
}
}
//Call pandas function, pass filter value if any, and receive dynamic arrays with data.
set sqlValue = ..MembersDF(sqlstring)
//Assign each tuple to a variable.
set idList = sqlValue.GetAt(1)
set timeList = sqlValue.GetAt(2)
//Calculate size of x-axis. It will be rows for our widget:
set rowCount = timeList.%Size()
//Since we need only one column, we assign variable to 1:
set colCount = 1
set ..%seriesCount=rowCount
//Now, for each row, assign time value and ID value of our members:
for rows = 1:1:..%seriesCount
{
set ..%seriesNames(rows)=timeList.%Get(rows-1)
for col = 1:1:colCount
{
set ..%data(rows,"Members")=idList.%Get(rows-1)
}
}
quit $$$OK
}
At this point, compile the KPI and create a widget on a dashboard using the KPI as a data source.
That's it! We have successfully navigated through the process of integrating and utilizing Pandas in our ObjectScript applications on InterSystems IRIS. This journey has taken us from fetching and formatting data to filtering and displaying it, all within a single, streamlined function. This demonstration highlights the efficiency and power of Pandas in data analysis. Now, let's explore some practical recommendations for implementing Pandas within the IRIS environment and conclude with insights on its transformative impact.
Recommendations for Practical Application of Pandas in IRIS
Start with Prototyping:
Begin your journey with Pandas by using example datasets and utilities. This approach helps you understand the basics and nuances of Pandas in a controlled and familiar environment. Prototyping allows you to experiment with different Pandas functions and methods without the risks associated with live data.
Gradual Implementation:
Introduce Pandas incrementally into your existing data processes. Instead of a complete overhaul, identify the areas where Pandas can enhance or simplify data handling and analysis. These could be simple tasks like data cleaning and aggregation, or more complex analyses where Pandas' capabilities can be fully leveraged.
Optimize Pandas Use:
Prior to working with large datasets, it is crucial to optimize your Pandas code. Efficient code can significantly reduce processing time and resource consumption, which is especially important in large-scale data analysis. Techniques such as vectorized operations, using appropriate data types, and avoiding loops in data manipulation can significantly enhance performance.
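As a quick illustration of the difference vectorization makes (a synthetic sketch, not code from this article), compare a row-by-row loop with the equivalent vectorized operation:
```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"amount": np.random.rand(1_000_000)})

# Slow: a Python-level function applied row by row
looped = df["amount"].apply(lambda x: x * 1.2)

# Fast: one vectorized operation over the whole column
vectorized = df["amount"] * 1.2

# Both produce the same result; the vectorized form is typically
# orders of magnitude faster on large columns
assert looped.equals(vectorized)
```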
Conclusion
The integration of Pandas into ObjectScript applications on the InterSystems IRIS platform marks a significant advancement in the field of data analysis. Pandas brings us an array of powerful tools for data processing, analysis, and visualization, which are now at the disposal of IRIS users. This integration not only accelerates and simplifies the development of KPIs and analytics but also paves the way for more sophisticated and advanced data analytical capabilities within the IRIS ecosystem. With Pandas, analysts and developers can explore new horizons in data analytics, leveraging its extensive functionalities to gain deeper insights from their data. The ability to process and analyze large datasets efficiently, coupled with the ease of creating compelling visualizations, empowers users to make more informed decisions and uncover trends and patterns that were previously difficult to detect. In summary, Pandas integration into the InterSystems IRIS environment is a transformative step, enhancing the capabilities of the platform and offering users an expanded toolkit for tackling the ever-growing challenges and complexities of modern data analysis.
Article
Eduard Lebedyuk · Apr 18
For my hundredth article on the Developer Community, I wanted to present something practical, so here's a comprehensive implementation of the [GPG Interoperability Adapter for InterSystems IRIS](https://github.com/intersystems-community/GPG).
Every so often, I would encounter a request for some GPG support, so I have had several code samples written for a while, and I thought I would combine all of them and add the missing GPG functionality for fairly complete coverage. That said, this Business Operation primarily covers data actions, skipping management actions such as key generation, export, and retrieval, as they are usually one-off and performed manually anyway. However, this implementation does support key imports, for obvious reasons. Well, let's get into it.
# What is GPG?
[GnuPG](https://gnupg.org/) is a complete and free implementation of the OpenPGP standard as defined by RFC4880 (also known as PGP). GnuPG allows you to encrypt and sign your data and communications and perform the corresponding tasks of decryption and signature verification.
For the InterSystems Interoperability adapter, I will be using Embedded Python, and specifically the [gnupg](https://gnupg.readthedocs.io/en/latest/) Python library.
> The gnupg module allows Python programs to make use of the functionality provided by the GNU Privacy Guard (abbreviated GPG or GnuPG). Using this module, Python programs can encrypt and decrypt data, digitally sign documents and verify digital signatures, manage (generate, list and delete) encryption keys, using Public Key Infrastructure (PKI) encryption technology based on OpenPGP.
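As a minimal standalone illustration of the library, independent of the adapter (the home directory, key file path, and recipient below are placeholders):
```python
import gnupg

# Point gnupg at a writable home directory (placeholder path)
gpg = gnupg.GPG(gnupghome='/tmp/gnupg-demo')

# Import an ASCII-armored key exported earlier (placeholder path)
with open('/tmp/keys/public.asc') as keyfile:
    gpg.import_keys(keyfile.read())

# Encrypt data for the imported key's owner (placeholder recipient)
encrypted = gpg.encrypt('hello world', recipients=['demo@example.com'])
print(encrypted.ok)    # overall success flag
print(str(encrypted))  # ASCII-armored ciphertext
```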
# Disclaimer
This project, whenever possible, aims to use GPG defaults. Your organization's security policies might differ. Your jurisdiction or your organization's compliance with various security standards might require you to use GPG with different settings or configurations. The user is wholly responsible for verifying that cryptographic settings are adequate and fully compliant with all applicable regulations. This module is provided under an MIT license. Author and InterSystems are not responsible for any improper or incorrect use of this module.
# Installation
## OS and Python
First, we'll need to install `python-gnupg`, which can be done using `pip` or `irispip`:
```
irispip install --target C:\InterSystems\IRIS\mgr\python python-gnupg
```
If you're on Windows, you should [install GPG itself](https://gnupg.org/download/index.html). GPG binaries must be in the path, and you must restart IRIS after GPG installation. If you're on Linux or Mac, you likely already have GPG installed.
## InterSystems IRIS
After that, load the [code](https://github.com/intersystems-community/GPG) into any Interoperability-enabled namespace and compile it. The code is in `Utils.GPG` package and has the following classes:
- `Operation`: main Business Operation class
- `*Request`: Interoperability request classes
- `*Response`: Interoperability response classes
- `File*`: Interoperability request and response classes using `%Stream.FileBinary` for payload
- `Tests`: code for manual testing, samples
Each request has two properties:
- `Stream` — set that to your payload. In File* requests, your stream must be of the `%Stream.FileBinary` class; for non-file requests, it must be of the `%Stream.GlobalBinary` class.
- `ResponseFilename` — (Optional) If set, the response will be written into a file at the specified location. If not set, for File requests, the response will be placed into a file with `.asc` or `.txt` added to the request filename. If not set, for global stream requests, the response will be a global stream.
The request type determines the GPG operation to perform. For example, `EncryptionRequest` is used to encrypt plaintext payloads.
Each response (except for Verify) has a `Stream` property, which holds the response, and a `Result` property, which holds a serializable subset of a GPG result object converted into IRIS persistent object. The most important property of a `Result` object is a boolean `ok`, indicating overall success.
## Sample key generation
Next, you need a sample key; skip this step if you already have one (the [project repo](https://github.com/intersystems-community/GPG) also contains sample keys you can use for debugging; the passphrase is `123456`):
Use any Python shell (for example, `do $system.Python.Shell()`):
```python
import gnupg
gpg_home = r'C:\InterSystems\IRIS\Mgr\pgp'
gpg = gnupg.GPG(gnupghome=gpg_home)
input_data = gpg.gen_key_input(key_type="RSA", key_length=2048)
master_key = gpg.gen_key(input_data)
public_key = r'C:\InterSystems\IRIS\Mgr\keys\public.asc'
result_public = gpg.export_keys(master_key.fingerprint, output=public_key)
private_key = r'C:\InterSystems\IRIS\Mgr\keys\private.asc'
result_private = gpg.export_keys(master_key.fingerprint, True, passphrase="", output=private_key)
```
You must set `gpg_home`, `private_key`, and `public_key` to valid paths. Note that a private key can only be exported with a passphrase.
# Production configuration
Add `Utils.GPG.Operation` to your Production. There are four custom settings available:
- `Home`: writable directory for GPG to keep its internal state.
- `Key`: path to a key file to import
- `Credentials`: if a key file is passphrase protected, select a Credential with a password to be used as a passphrase.
- `ReturnErrorOnNotOk`: If this is `False` and the GPG operation fails, the response will be returned with all the info we managed to collect. If this is `True`, any GPG error will result in an exception.
On startup, the operation loads the key and logs `GPG initialized` if everything is okay. After that, it can accept all request types based on an imported key (a public key can only encrypt and verify).
# Usage
Here's a sample encryption request:
```objectscript
/// do ##class(Utils.GPG.Tests).Encrypt()
ClassMethod Encrypt(target = {..#TARGET}, plainFilename As %String, encryptedFilename As %String)
{
if $d(plainFilename) {
set request = ##class(FileEncryptionRequest).%New()
set request.Stream = ##class(%Stream.FileBinary).%New()
$$$TOE(sc, request.Stream.LinkToFile(plainFilename))
} else {
set request = ##class(EncryptionRequest).%New()
set request.Stream = ##class(%Stream.GlobalBinary).%New()
do request.Stream.Write("123456")
$$$TOE(sc, request.Stream.%Save())
}
if $d(encryptedFilename) {
set request.ResponseFilename = encryptedFilename
}
set sc = ##class(EnsLib.Testing.Service).SendTestRequest(target, request, .response, .sessionid)
zw sc, response, sessionid
}
```
In the same manner, you can perform Decryption, Sign, and Verification requests. Check `Utils.GPG.Tests` for all the examples.
# Why Business Operation?
While writing this, I received a very interesting question about why GPG needs to be a separate Business Host and not a part of a Business Process. As this can be very important for any cryptography code, I wanted to include my rationale on that topic.
I would like to start with how Business Processes work and why this is a crucial consideration for cryptography code.
Consider this simple Business Process `User.BPL` (the BPL below is a minimal reconstruction, consistent with the generated classes that follow):
```xml
<process language='objectscript' request='Ens.Request' response='Ens.Response'>
<context>
<property name='x' type='%Integer'/>
</context>
<sequence>
<assign property="context.x" value="1" action="set"/>
<assign property="context.x" value="2" action="set"/>
</sequence>
</process>
```
It will generate the `Context` class with one property:
```objectscript
Class User.BPL.Context Extends Ens.BP.Context
{
Property x As %Integer;
}
```
And the `State` class with two methods (simplified):
```objectscript
Method S1(process As User.BPL, context As User.BPL.Context)
{
Set context.x=1
Set ..%NextState="S2"
Quit ..ManageState()
}
Method S2(process As User.BPL, context As User.BPL.Context)
{
Set context.x=2
Set ..%NextState="S3"
Quit ..ManageState()
}
```
Since BP is a state machine, it will simply call the first state and then whatever is set in `%NextState`. Each state has information on all possible next states—for example, one next state for a true path and another for a false path in the if block state.
However, the BP engine manages the state between state invocations. In our case, it saves the `User.BPL.Context` object, which holds the entire context: property `x`.
But there's no guarantee that after saving the state of a particular BP invocation, the subsequent state for this invocation would be called next immediately.
The BP engine might wait for a reply from BO/BP, work on another invocation, or even work on another process entirely if we're using shared pool workers. Even with a dedicated worker pool, another worker might grab the same process invocation to continue working on it.
This is usually fine, since the worker's first action before executing the next state is loading the saved context; in our example, it's an object of the `User.BPL.Context` class with one integer property `x`, which works.
But in the case of any cryptography library, the context must contain something along the lines of:
```objectscript
/// Private Python object holding GPG module
Property %GPG As %SYS.Python;
```
Which is a runtime Python module object that cannot be persisted. It also likely cannot be pickled or even dilled as we initialize a crypto context to hold a key — the library is rather pointless without it, after all.
So, while theoretically it could work if the entire cryptography workload (idempotent init, idempotent key load, encryption, signing) is handled within one state, that is a consideration that must always be carefully observed. Especially since, in many cases, it will work in low-load environments (i.e., dev), where there's no queue to speak of and one BP invocation will likely progress from beginning to end uninterrupted. But when the same code is promoted to a high-load environment with queues and resource contention (i.e., live), the BP engine is likelier to switch between different invocations to speed things up.
That's why I highly recommend extracting your cryptography code into a separate business operation. Since one business operation can handle multiple message types, you can have one business operation that processes PGP signing/encryption/verification requests. Since BOs (and BSes) are not state machines, once you load the library and key(s) in the init code, they will not be unloaded until your BH job expires one way or another.
# Conclusion
The GPG Interoperability Adapter for InterSystems IRIS allows you to use GPG easily if you need encryption/decryption and signing/verification.
# Documentation
- [GnuPG](https://gnupg.org/)
- [Python GnuPG](https://gnupg.readthedocs.io/en/latest/)
- [OpenExchange](https://openexchange.intersystems.com/package/GPG)
- [Repo](https://github.com/intersystems-community/GPG)
Announcement
Anastasia Dyubaylo · Apr 7
Hi Community,
It's time to announce the winners of the AI Programming Contest: Vector Search, GenAI and AI Agents!
Thanks to all our amazing participants who submitted 15 applications 🔥
Now it's time to announce the winners!
Experts Nomination
🥇 1st place and $5,000 go to the bg-iris-agent app by @geotat, @Elena.Karpova, @Alena.Krasinskiene
🥈 2nd place and $2,500 go to the mcp-server-iris app by @Dmitry.Maslennikov
🥉 3rd place and $1,000 go to the langchain-iris-tool app by @Yuri.Gomes
🏅 4th place and $500 go to the Facilis app by @Henrique, @henry, @José.Pereira
🏅 5th place and $300 go to the toot app by @Alex.Woodhead
🌟 $100 go to the iris-AgenticAI app by @Muhammad.Waseem
🌟 $100 go to the iris-easybot app by @Eric.Fortenberry
🌟 $100 go to the oncorag app by Patrick Salome
🌟 $100 go to the AiAssistant app by @XININGMA
🌟 $100 go to the iris-data-analysis app by @lando.miller
Community Nomination
🥇 1st place and $1,000 go to the AiAssistant app by @XININGMA
🥈 2nd place and $600 go to the bg-iris-agent app by @geotat, @Elena.Karpova, @Alena.Krasinskiene
🥉 3rd place and $300 go to the iris-data-analysis app by @lando.miller
🏅 4th place and $200 go to the Facilis app by @Henrique, @henry, @José.Pereira
🏅 5th place and $100 go to the langchain-iris-tool app by @Yuri.Gomes
Our sincerest congratulations to all the participants and winners!
Join the fun next time ;)
Article
Yuri Marx Pereira Gomes · Nov 1, 2021
InterSystems IRIS is a great data platform, and it meets the current feature requirements of the market. In this article, you will see the top 10:
Note: this list was updated because many features have been added to IRIS in the last 3 years (thanks @Kristina.Lauer).
1. Democratized analytics
Why: InterSystems IRIS Adaptive Analytics delivers virtual cubes with centralized business semantics, abstracted from technical details and modeling, to allow business users to easily and quickly create their analyses in Excel or their preferred analytics product (PowerBI, Tableau, etc.), with no consumption restrictions per user. InterSystems Reports is a low-code report designer for delivering operational data reports embedded in any application or in a web report portal.
Learning more about it: Overview of Adaptive Analytics, Adaptive Analytics Essentials, Introduction to InterSystems Reports, Delivering Data Visually with InterSystems Reports
2. API Manager
Why: Digital assets are consumed using REST APIs, and reuse, security, consumption, the asset catalog, the developer ecosystem, and other aspects need to be governed from a central point. The API Manager is the right tool for this, so every company has, or wants to have, one.
Learning more about it: Hands-On with API Manager for Devs
3. Scalable databases
Why: Sharding: the total amount of data created, captured, copied, and consumed globally reached a new high of 64.2 zettabytes in 2020 and is projected to grow to more than 180 zettabytes by 2025 (source: https://www.statista.com/statistics/871513/worldwide-data-created/). In this scenario, it is critical for the business to be able to process data in a distributed way (in shards, like Hadoop or MongoDB) to maintain and increase performance. IRIS is also three times faster than Caché, and faster than AWS databases in the AWS cloud. Columnar storage: stores repeating data in columns instead of rows, allowing up to 10x higher performance, especially in aggregated (analytical) data storage scenarios.
Learning more about it: Planning and Deploying a Sharded Cluster, Scaling for Data Volume with Sharding, Increasing Analytical Query Speed Using Columnar Storage, Using Columnar Storage
4. Python support
Why: Python is the most popular language for AI, and AI is at the center of business strategy because it lets you gain new insights, improve productivity, and reduce costs.
Learning more about it: Writing Python Applications with InterSystems, Leveraging Embedded Python in Interoperability Productions
5. Native APIs (Java, .NET, Node.js, Python) and PEX
Why: The US has nearly 1 million open IT jobs (source: https://www.cnbc.com/2019/11/06/how-switching-careers-to-tech-could-solve-the-us-talent-shortage.html). It is very hard to find an ObjectScript developer, so it is important to be able to use IRIS features, like interoperability, with the development team's official programming language (Python, Java, .NET, etc.).
Learning more about it: Creating Interoperability Productions Using PEX, InterSystems IRIS for Coders, Node.js QuickStart, Using the Native API for Python
6. Interoperability, FHIR, and IoT
Why: Businesses are constantly connecting and exchanging data, and departments need to work connected to deliver business processes with more strategic value and lower cost. The best technology for this is interoperability tooling: ESB, integration adapters, business process automation engines (BPL), data transformation tools (DTL), and the adoption of market interoperability standards such as FHIR and MQTT/IoT. InterSystems Interoperability supports all of this (for FHIR, use IRIS for Health).
Learning more about it: Receiving and Routing Data in a Production, Building Basic FHIR Integrations with InterSystems, Monitoring Remotely with MQTT, Building Business Integrations with InterSystems IRIS
7. Cloud, Docker, and microservices
Why: Everyone now wants a cloud microservices architecture, breaking monoliths into projects that are smaller, less complex, less coupled, more scalable, reusable, and independent. IRIS allows you to deploy data, application, and analytics microservices thanks to its support for shards, Docker, Kubernetes, distributed computing, and DevOps tools, and its low CPU/memory consumption (IRIS even supports ARM processors!). Microservices do, however, require API management, using API Manager, to stay aligned with the business.
Learning more about it: Deploying InterSystems IRIS in Containers and the Cloud, Deploying and Testing InterSystems Products Using CI/CD Pipelines
8. Vector Search and generative AI
Why: Vectors are mathematical representations of data and textual semantics (NLP) and are the raw material with which generative AI applications understand questions and tasks and return correct answers. Vector repositories and searches store vectors (AI processing) so that, for each new task or question, they can retrieve what has already been produced (AI memory or knowledge base), making everything faster and cheaper.
Learning more about it: Developing Generative AI Applications, Using Vector Search
9. VS Code support
Why: VS Code is the most popular IDE, and InterSystems IRIS has a good set of tools for it.
Learning more about it: Developing on an InterSystems Server Using VS Code
10. Data science
Why: The ability to apply data science to data and to integration and transaction requests and responses, using Python, R, and IntegratedML (AutoML), enables AI at the moment the business requires it.
Learning more about it: Hands-On with IntegratedML, Developing in Python or R within InterSystems IRIS, Predicting Outcomes with IntegratedML in InterSystems IRIS
Article
Kristina Lauer · Jul 29, 2024
Updated 2/27/25
Hi Community,
You can unlock the full potential of InterSystems IRIS—and help your team onboard—with the full range of InterSystems learning resources offered online and in person, for every role in your organization. Developers, system administrators, data analysts, and integrators can quickly get up to speed.
Onboarding Resources for Every Role
Developers
Online Learning Program: Getting Started with InterSystems IRIS for Coders (21h)
Classroom Training: Developing with InterSystems Objects and SQL (5 days)
System Administrators
Learning Path: InterSystems IRIS Management Basics (10h)
Classroom Training: Managing InterSystems Servers (5 days)
Data Analysts
Video: Introduction to Analytics with InterSystems (6m)
Learning Paths for every tool:
Analyzing Data with InterSystems IRIS BI
Delivering Data Visually with InterSystems Reports (1h 15m)
Build Data Models Using Adaptive Analytics (2h 15m)
Classroom Training: Using InterSystems Embedded Analytics (5 days)
Integrators
Learning Program: Getting Started with InterSystems IRIS for Health for Integrators (14h)
Classroom Training: Developing System Integrations and Building and Managing HL7 Integrations (5 days each)
Implementers
Learning Path: Deploying InterSystems IRIS in Containers and the Cloud (3h)
Learning Program: Getting Started with InterSystems IRIS for Implementers (26h)
Project managers
Watch product overview videos.
Read success stories to get inspired—see how others are using InterSystems products!
Other Resources from Learning Services
💻 Online Learning: Register for free at learning.intersystems.com to access self-paced courses, videos, and exercises. You can also complete task-based learning paths or role-based programs to advance your career.
👩🏫 Classroom Training: Check the schedule of live, in-person or virtual classroom training, or request a private course for your team. Find details at classroom.intersystems.com.
📘 InterSystems IRIS documentation: Comprehensive reference materials, guides, and how-to articles. Explore the documentation.
📧 Support: For technical support, email support@intersystems.com.
Certification Opportunities
Once you and your team members have gained enough training and experience, get certified according to your role!
Learn from the Community
💬Engage in learning on the Developer Community: Chat with other developers, post questions, read articles, and stay updated with the latest announcements. See this post for tips on how to learn on the Developer Community.
With these learning resources, your team will be well equipped to maximize the capabilities of InterSystems IRIS, driving your organization's growth and success. For additional assistance, post questions here or ask your dedicated Sales Engineer.
Announcement
Anastasia Dyubaylo · Apr 11
Hi Community,
We're happy to announce that registration for the event of the year — InterSystems Ready 2025 — is now open. This is the Global Summit we all know and love, but with a new name!
➡️ InterSystems Ready 2025
🗓 Dates: June 22-25, 2025
📍 Location: Signia Hilton Bonnet Creek, Orlando, FL, USA
InterSystems READY 2025 is a friendly and informative environment for the InterSystems community to meet, interact, and exchange knowledge.
The READY 2025 event includes:
Sessions: 3 and a half days of sessions geared to the needs of software developers and managers. Sessions repeat so you don’t have to miss out as you build your schedule.
Inspiring keynotes: Presentations that challenge your assumptions and highlight new possibilities.
What’s next: In the keynotes and breakout sessions you’ll learn what’s on the InterSystems roadmap, so you’ll be ready to go when new tech is released.
Networking: Meet InterSystems executives, members of our global product and innovation teams, and peers from around the world to discuss what matters most to you.
Workshops and personal training: Dive into exactly what you need with an InterSystems expert, including one-on-ones.
Startup program: Demonstrate your tech, connect with potential buyers, and learn how InterSystems can help you accelerate growth of your business.
Partner Pavilion: Looking for a consultant, systems integrator, tools to simplify your work? It’s all in the pavilion.
Fun: Demos and Drinks, Tech Exchange, and other venues.
Learn more about pricing on the official website, and don't forget that the super early bird discount lapses on April 16th!
We look forward to seeing you at InterSystems Ready 2025!
Article
Murray Oldfield · Nov 29, 2016
This post provides guidelines for configuration, system sizing and capacity planning when deploying Caché 2015 and later on a VMware ESXi 5.5 and later environment.
I jump right in with recommendations, assuming you already have an understanding of the VMware vSphere virtualization platform. The recommendations in this guide are not specific to any particular hardware or site-specific implementation, and they are not intended as a fully comprehensive guide to planning and configuring a vSphere deployment -- rather, this is a checklist of best practice configuration choices you can make. I expect that the recommendations will be evaluated for a specific site by your expert VMware implementation team.
[A list of other posts in the InterSystems Data Platforms and performance series is here.](https://community.intersystems.com/post/capacity-planning-and-performance-series-index)
_Note:_ This post was updated on 3 Jan 2017 to highlight that VM memory reservations must be set for production database instances to guarantee memory is available for Caché and there will be no swapping or ballooning which will negatively impact database performance. See the section below *Memory* for more details.
### References
The information here is based on experience and reviewing publicly available VMware knowledge base articles and VMware documents for example [Performance Best Practices for VMware vSphere](https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/performance/vsphere-esxi-vcenter-server-67-performance-best-practices.pdf) and mapping to requirements of Caché database deployments.
## Are InterSystems' products supported on ESXi?
It is InterSystems policy and procedure to verify and release InterSystems’ products against processor types and operating systems including when operating systems are virtualised. For specifics see [InterSystems support policy](http://www.intersystems.com/services-support/product-support/virtualization/) and [Release Information](http://www.intersystems.com/services-support/product-support/latest-platform/).
>For example: Caché 2016.1 running on Red Hat 7.2 operating system on ESXi on x86 hosts is supported.
Note: If you do not write your own applications you must also check your application vendors support policy.
### Supported Hardware
VMware virtualization works well for Caché when used with current server and storage components. Caché using VMware virtualization has been deployed successfully at customer sites and has been proven in benchmarks for performance and scalability. There is no significant performance impact using VMware virtualization on properly configured storage, network, and servers with later model Intel Xeon processors, specifically: Intel Xeon 5500, 5600, 7500, E7-series and E5-series (including the latest E5 v4).
Generally Caché and applications are installed and configured on the guest operating system in the same way as for the same operating system on bare-metal installations.
It is the customer's responsibility to check the [VMware compatibility guide](http://www.vmware.com/resources/compatibility/search.php) for the specific servers and storage being used.
# Virtualised architecture
I see VMware commonly used in two standard configurations with Caché applications:
- Where primary production database operating system instances are on a ‘bare-metal’ cluster, and VMware is only used for additional production and non-production instances such as web servers, printing, test, training and so on.
- Where ALL operating system instances, including primary production instances are virtualized.
This post can be used as a guide for either scenario, however the focus is on the second scenario where all operating system instances including production are virtualised. The following diagram shows a typical physical server set up for that configuration.
_Figure 1. Simple virtualised Caché architecture_
Figure 1 shows a common deployment with a minimum of three physical host servers to provide N+1 capacity and availability with host servers in a VMware HA cluster. Additional physical servers may be added to the cluster to scale resources. Additional physical servers may also be required for backup/restore media management and disaster recovery.
For recommendations specific to _VMware vSAN_, VMware's Hyper-Converged Infrastructure solution, see the following post: [Part 8 Hyper-Converged Infrastructure Capacity and Performance Planning](https://community.intersystems.com/post/intersystems-data-platforms-and-performance-%E2%80%93-part-8-hyper-converged-infrastructure-capacity). Most of the recommendations in this post can be applied to vSAN -- with the exception of some of the obvious differences in the Storage section below.
# VMWare versions
The following table shows key recommendations for Caché 2015 and later:
vSphere is a suite of products including vCenter Server that allows centralised system management of hosts and virtual machines via the vSphere client.
>This post assumes that vSphere will be used, not the "free" ESXi Hypervisor only version.
VMware has several licensing models; ultimately choice of version is based on what best suits your current and future infrastructure planning.
I generally recommend the "Enterprise" edition for its added features such as Distributed Resource Scheduler (DRS) for more efficient hardware utilization and Storage APIs for storage array integration (snapshot backups). The VMware web site shows edition comparisons.
There are also Advanced Kits that allow bundling of vCenter Server and CPU licenses for vSphere. Kits have limitations for upgrades so are usually only recommended for smaller sites that do not expect growth.
# ESXi Host BIOS settings
The ESXi host is the physical server. Before configuring BIOS you should:
- Check with the hardware vendor that the server is running the latest BIOS
- Check whether there are any server/CPU model specific BIOS settings for VMware.
Default settings for server BIOS may not be optimal for VMware. The following settings can be used to optimize the physical host servers to get best performance. Not all settings in the following table are available on all vendors’ servers.
# Memory
The following key rules should be considered for memory allocation:
When running multiple Caché instances or other applications on a single physical host, VMware has several technologies for efficient memory management, such as transparent page sharing (TPS), ballooning, swap, and memory compression. For example, when multiple OS instances are running on the same host, TPS allows overcommitment of memory without performance degradation by eliminating redundant copies of pages in memory, which allows virtual machines to run with less memory than on a physical machine.
>Note: VMware Tools must be installed in the operating system to take advantage of these and many other features of VMware.
Although these features exist to allow for overcommitting memory, the recommendation is to always start by sizing the vRAM of all VMs to fit within the physical memory available. Especially in production environments, it is important to carefully consider the impact of overcommitting memory and to overcommit only after collecting data to determine the amount of overcommitment possible. To determine the effectiveness of memory sharing and the degree of acceptable overcommitment for a given Caché instance, run the workload and use the VMware commands `resxtop` or `esxtop` to observe the actual savings.
A good reference is to go back and look at the [fourth post in this series on memory](https://community.intersystems.com/post/intersystems-data-platforms-and-performance-part-4-looking-memory) when planning your Caché instance memory requirements. Especially the section "VMware Virtualisation considerations" where I point out:
>Set VMware memory reservation on production systems.
You must avoid any swapping of shared memory, so set your production database VMs' memory reservation to at least the size of Caché shared memory plus memory for Caché processes and operating system and kernel services. If in doubt, **reserve the full production database VM's memory (100% reservation)** to guarantee memory is available for your Caché instance, so there will be no swapping or ballooning to negatively impact database performance.
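As a back-of-the-envelope illustration of that sizing rule (all figures below are hypothetical, not recommendations):
```python
# Hypothetical memory reservation arithmetic for a production database VM (GB)
global_buffers = 96      # Caché database cache (shared memory)
routine_buffers = 2      # Caché routine cache (shared memory)
gmheap_other_shared = 2  # gmheap and other shared memory
cache_processes = 16     # estimated memory for Caché processes
os_kernel = 8            # operating system and kernel services

reservation = (global_buffers + routine_buffers + gmheap_other_shared
               + cache_processes + os_kernel)
print(f"Set the VM memory reservation to at least {reservation} GB")  # 124 GB
```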
Notes: Large memory reservations will impact vMotion operations, so it is important to take this into consideration when designing the vMotion/management network. A virtual machine can only be live migrated, or started on another host with VMware HA, if the target host has free physical memory greater than or equal to the size of the reservation. This is especially important for production Caché VMs; for example, pay particular attention to HA Admission Control policies.
>Ensure capacity planning allows for distribution of VMs in event of HA failover.
For non-production environments (test, train, etc.) more aggressive memory overcommitment is possible; however, do not overcommit Caché shared memory. Instead, limit shared memory in the Caché instance by configuring fewer global buffers.
Current Intel processor architecture has a NUMA topology. Processors have their own local memory and can access memory on other processors in the same host. Not surprisingly, accessing local memory has lower latency than accessing remote memory. For a discussion of CPU, check out the [third post in this series](https://community.intersystems.com/post/intersystems-data-platforms-and-performance-%E2%80%93-part-3-focus-cpu), including a discussion about NUMA in the _comments section_.
As noted in the BIOS section above, a strategy for optimal performance is to ideally size VMs only up to the maximum number of cores and memory on a single processor. For example, if your capacity planning shows your biggest production Caché database VM will be 14 vCPUs and 112 GB memory, then consider whether a cluster of servers with 2x E5-2680 v4 (14-core processor) and 256 GB memory is a good fit.
>**Ideally** size VMs to keep memory local to a NUMA node. But don't get too hung up on this.
If you need a "Monster VM" bigger than a NUMA node, that is OK; VMware will manage NUMA for optimal performance. It is also important to right-size your VMs and not allocate more resources than are needed (see below).
## CPU
The following key rules should be considered for virtual CPU allocation:
Production Caché systems should be sized based on benchmarks and measurements at live customer sites. For production systems, use a strategy of initially sizing the system the same as bare-metal CPU cores, then monitoring as per best practice to see whether the number of virtual CPUs (vCPUs) can be reduced.
### Hyperthreading and capacity planning
A good starting point for sizing __production database__ VMs based on your rules for physical servers is to calculate the physical server CPU requirements for the target processor with hyper-threading enabled, then simply make the translation:
>One physical CPU (includes hyperthreading) = One vCPU (includes hyperthreading).
A common misconception is that hyper-threading somehow doubles vCPU capacity. This is NOT true for physical servers or for logical vCPUs. Hyper-threading on a bare-metal server may give a 30% uplift in performance over the same server without hyper-threading, but this can also vary depending on the application.
For initial sizing, assume that the vCPU has full core dedication. For example, if you have a 32-core (2x 16-core) E5-2683 V4 server, size for a total of up to 32 vCPUs, knowing there may be available headroom. This configuration assumes hyper-threading is enabled at the host level. VMware will manage the scheduling between all the applications and VMs on the host. Once you have spent time monitoring the application, operating system, and VMware performance during peak processing times, you can decide whether higher consolidation is possible.
### Licensing
In vSphere you can configure a VM with a certain number of sockets or cores. For example, if you have a dual-processor VM (2 vCPUs), it can be configured as two CPU sockets, or as a single socket with two CPU cores. From an execution standpoint it does not make much of a difference because the hypervisor will ultimately decide whether the VM executes on one or two physical sockets. However, specifying that the dual-CPU VM really has two cores instead of two sockets could make a difference for software licenses. Note: Caché license counts the cores (not threads).
# Storage
>This section applies to the more traditional storage model using a shared storage array. For _vSAN_ recommendations also see the following post: [Part 8 Hyper-Converged Infrastructure Capacity and Performance Planning](https://community.intersystems.com/post/intersystems-data-platforms-and-performance-%E2%80%93-part-8-hyper-converged-infrastructure-capacity)
The following key rules should be considered for storage:
## Size storage for performance
Storage bottlenecks are one of the most common problems affecting Caché system performance, and the same is true for VMware vSphere configurations. The most common problem is sizing storage simply for GB capacity rather than allocating a high enough number of spindles to support the expected IOPS. Storage problems can be even more severe in VMware because more hosts can be accessing the same storage over the same physical connections.
## VMware Storage overview
VMware storage virtualization can be categorized into three layers, for example:
- The storage array is the bottom layer, consisting of physical disks presented as logical disks (storage array volumes or LUNs) to the layer above.
- The next layer is the virtual environment occupied by vSphere. Storage array LUNs are presented to ESXi hosts as datastores and are formatted as VMFS volumes.
- Virtual machines are made up of files in the datastore, including virtual disks, which are presented to the guest operating system as disks that can be partitioned and used in file systems.
VMware offers two choices for managing disk access in a virtual machine: VMware Virtual Machine File System (VMFS) and raw device mapping (RDM). Both offer similar performance. For simpler management, VMware generally recommends VMFS, but there may be situations where RDMs are required. As a general recommendation, unless there is a particular reason to use RDM, choose VMFS; _new development by VMware is directed to VMFS and not RDM._
### Virtual Machine File System (VMFS)
VMFS is a file system developed by VMware that is dedicated and optimized for clustered virtual environments (allows read/write access from several hosts) and the storage of large files. The structure of VMFS makes it possible to store VM files in a single folder, simplifying VM administration. VMFS also enables VMware infrastructure services such as vMotion, DRS and VMware HA.
Operating systems, applications, and data are stored in virtual disk files (.vmdk files), which are stored in the datastore. A single VM can be made up of multiple vmdk files spread over several datastores. As the production VM in the diagram below shows, a VM can include storage spread over several datastores. For production systems, best performance is achieved with one vmdk file per LUN; for non-production systems (test, training, etc.), multiple VMs' vmdk files can share a datastore and a LUN.
While vSphere 5.5 has a maximum VMFS volume size of 64TB and a maximum VMDK size of 62TB, when deploying Caché, multiple VMFS volumes mapped to LUNs on separate disk groups are typically used to separate IO patterns and improve performance: for example, random versus sequential IO disk groups, or separating production IO from the IO of other environments.
The following diagram shows an overview of an example VMware VMFS storage used with Caché:
_Figure 2. Example Caché storage on VMFS_
### RDM
RDM allows management and access of raw SCSI disks or LUNs as VMFS files. An RDM is a special file on a VMFS volume that acts as a proxy for a raw device. VMFS is recommended for most virtual disk storage, but raw disks might be desirable in some cases. RDM is only available for Fibre Channel or iSCSI storage.
### VMware vStorage APIs for Array Integration (VAAI)
For the best storage performance, customers should consider using VAAI-capable storage hardware. VAAI can improve performance in several areas, including virtual machine provisioning and the performance of thin-provisioned virtual disks. VAAI may be available as a firmware update from the array vendor for older arrays.
### Virtual Disk Types
ESXi supports multiple virtual disk types:
**Thick Provisioned** – where space is allocated at creation. There are two further types:
- Eager Zeroed – writes 0’s to the entire drive. This increases the time it takes to create the disk, but results in the best performance, even on the first write to each block.
- Lazy Zeroed – writes 0’s as each block is first written to. Lazy zero results in a shorter creation time, but reduced performance the first time a block is written to. Subsequent writes, however, have the same performance as on eager-zeroed thick disks.
**Thin Provisioned** – where space is allocated and zeroed upon write. There is a higher I/O cost (similar to that of lazy-zeroed thick disks) during the first write to an unwritten file block, but on subsequent writes thin-provisioned disks have the same performance as eager-zeroed thick disks.
_In all disk types VAAI can improve performance by offloading operations to the storage array._ Some arrays also support thin provisioning at the array level; do not thin provision ESXi disks on thin-provisioned array storage, as there can be conflicts in provisioning and management.
### Other Notes
As noted above for best practice use the same strategies as bare-metal configurations; production storage may be separated at the array level into several disk groups:
- Random access for Caché production databases
- Sequential access for backups and journals, but also a place for other non-production storage such as test, train, and so on
Remember that a datastore is an abstraction of the storage tier and, therefore, it is a logical representation not a physical representation of the storage. Creating a dedicated datastore to isolate a particular I/O workload (whether journal or database files), without isolating the physical storage layer as well, does not have the desired effect on performance.
Although performance is key, the choice of shared storage depends more on the existing or planned infrastructure at the site than on the impact of VMware. As with bare-metal implementations, an FC SAN performs best and is recommended; for FC, 8Gbps adapters are the recommended minimum. iSCSI storage is only supported if appropriate network infrastructure is in place, including a minimum of 10Gb Ethernet, and jumbo frames (MTU 9000) must be supported on all components in the network between server and storage, with separation from other traffic.
Use multiple VMware Paravirtual SCSI (PVSCSI) controllers for the database virtual machines or virtual machines with high I/O load. PVSCSI can provide some significant benefits by increasing overall storage throughput while reducing CPU utilization. The use of multiple PVSCSI controllers allows the execution of several parallel I/O operations inside the guest operating system. It is also recommended to separate journal I/O traffic from the database I/O traffic through separate virtual SCSI controllers. As a best practice, you can use one controller for the operating system and swap, another controller for journals, and one or more additional controllers for database data files (depending on the number and size of the database data files).
Aligning file system partitions is a well-known storage best practice for database workloads. Partition alignment on both physical machines and VMware VMFS partitions prevents I/O performance degradation caused by I/O crossing track boundaries. VMware test results show that aligning VMFS partitions to 64KB track boundaries results in reduced latency and increased throughput. VMFS partitions created using vCenter are aligned on 64KB boundaries, as recommended by storage and operating system vendors.
# Networking
The following key rules should be considered for networking:
As noted above, VMXNET adapters have better capabilities than the default E1000 adapter. VMXNET3 supports 10Gb and uses less CPU, whereas the E1000 is only 1Gb. If there are only 1-gigabit network connections between hosts, there is not a lot of difference for client-to-VM communication. However, VMXNET3 allows 10Gb between VMs on the same host, which does make a difference, especially in multi-tier deployments or where there are high network IO requirements between instances. This feature should also be taken into consideration when planning affinity and anti-affinity DRS rules to keep VMs on the same or separate virtual switches.
The E1000 uses universal drivers that are available for Windows and Linux. Once VMware Tools is installed on the guest operating system, VMXNET virtual adapters can be installed.
The following diagram shows a typical small server configuration with four physical NIC ports, two ports have been configured within VMware for infrastructure traffic: dvSwitch0 for Management and vMotion, and two ports for application use by VMs. NIC teaming and load balancing is used for best throughput and HA.
_Figure 3. A typical small server configuration with four physical NIC ports._
# Guest Operating Systems
The following are recommended:
>It is very important to install VMware Tools in all VM operating systems and keep the tools current.
VMware Tools is a suite of utilities that enhances the performance of the virtual machine's guest operating system and improves management of the virtual machine. Without VMware Tools installed in your guest operating system, you lose important functionality and performance benefits.
It is vital that the time is set correctly on all ESXi hosts, because it ultimately affects the guest VMs. The default setting for VMs is not to sync the guest time with the host, but under certain conditions guests do still sync their time with the host, and if the host time is wrong this has been known to cause major issues. VMware recommends using NTP instead of VMware Tools periodic time synchronization. NTP is an industry standard and ensures accurate timekeeping in your guest. It may be necessary to open the firewall (UDP 123) to allow NTP traffic.
# DNS Configuration
If your DNS server is hosted on virtualized infrastructure and becomes unavailable, it prevents vCenter from resolving host names, making the virtual environment unmanageable; the virtual machines themselves, however, keep operating without problems.
# High Availability
High availability is provided by features such as VMware vMotion, VMware Distributed Resource Scheduler (DRS) and VMware High Availability (HA). Caché Database mirroring can also be used to increase uptime.
It is important that Caché production systems are designed with n+1 physical hosts: there must be enough resources (e.g. CPU and memory) for all the VMs to run on the remaining hosts in the event of a single host failure. If VMware cannot allocate enough CPU and memory resources on the remaining servers after a failure, VMware HA will not restart the VMs there.
## vMotion
vMotion can be used with Caché. vMotion allows migration of a functioning VM from one ESXi host server to another in a fully transparent manner. The OS and applications such as Caché running in the VM have no service interruption.
When migrating with vMotion, only the state and memory of the VM, along with its configuration, move. The virtual disk does not need to move; it stays in the same shared-storage location. Once the VM has migrated, it is operating on the new physical host.
vMotion can function only with a shared storage architecture (such as a shared SAS array, FC SAN, or iSCSI). Because Caché is usually configured with a large amount of shared memory, it is important to have adequate network capacity available to vMotion; a 1Gb network may be sufficient, but higher bandwidth may be required, or multi-NIC vMotion can be configured.
## DRS
Distributed Resource Scheduler (DRS) is a method of automating the use of vMotion in a production environment by sharing the workload among different host servers in a cluster.
DRS also provides the ability to implement QoS for VM instances, protecting resources for production VMs by preventing non-production VMs from over-using them. DRS collects information about the use of the cluster's host servers and optimizes resources by distributing the VMs' workload among the cluster's different servers. This migration can be performed automatically or manually.
## Caché Database Mirror
For mission critical tier-1 Caché database application instances requiring the highest availability consider also using [InterSystems synchronous database mirroring.](http://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=GHA_mirror#GHA_mirror_set_bp_vm) Additional advantages of also using mirroring include:
- Separate copies of up-to-date data.
- Failover in seconds (faster than restarting a VM, then the operating system, and then recovering Caché).
- Failover in case of application/Caché failure (not detected by VMware).
# vCenter Appliance
The vCenter Server Appliance is a preconfigured Linux-based virtual machine optimized for running vCenter Server and associated services. I have been recommending that sites with small clusters use the VMware vCenter Server Appliance as an alternative to installing vCenter Server on a Windows VM. In vSphere 6.5, the appliance is recommended for all deployments.
# Summary
This post is a rundown of key best practices you should consider when deploying Caché on VMware. Most of these best practices are not unique to Caché but can be applied to other tier-1 business critical deployments on VMware.
If you have any questions, please let me know via the comments below.

Good day Murray,
Is there anything to look out for when VMs hosting Ensemble databases on VMware are part of VMware Site Recovery Manager making use of vSphere Replication? Can it alone be safely used to boot up the VM on the other site?
Regards,
Anzelem.
Announcement
Benjamin De Boe · Jun 30, 2023
InterSystems IRIS Cloud SQL is a fully managed cloud service that brings the power of InterSystems IRIS relational database capabilities used by thousands of enterprise customers to a broad audience of application developers and data professionals. InterSystems IRIS Cloud IntegratedML is an option to this database-as-a-service that offers easy access to powerful Automated Machine Learning capabilities in a SQL-native form, through a set of simple SQL commands that can easily be embedded in application code to augment them with ML models that run close to the data.
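As a rough illustration of what "SQL-native" AutoML looks like, here is a minimal sketch of the IntegratedML CREATE MODEL / TRAIN MODEL / PREDICT workflow; the table and column names below are hypothetical placeholders, not part of the service:

```sql
-- Define a model predicting one column; IntegratedML infers the features
-- from the remaining columns. Table and column names are hypothetical.
CREATE MODEL ReadmissionModel PREDICTING (RiskOfReadmission)
    FROM Hospital.PatientHistory;

-- Train the model on the rows currently in the training table
TRAIN MODEL ReadmissionModel;

-- Apply the trained model to new rows with the PREDICT() function
SELECT ID, PREDICT(ReadmissionModel) AS PredictedRisk
FROM Hospital.NewPatients;
```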
Today, we announce the Developer Access Program for these two offerings. Application developers can now self-register for the service, create deployments and start building composable applications and smart data services, with all provisioning, configuration and administration taken care of by the service.
Developers can take advantage of a free trial that covers a small deployment for a limited time entirely free of charge in order to get started quickly and experience the performance of InterSystems IRIS Cloud technology. Alternatively, customers will be able to subscribe through the AWS marketplace to deploy the full set of instance sizes, which will get invoiced to their AWS account. To complement and support this core relational database service, the InterSystems Cloud team will continue to rapidly roll out additional features and capabilities throughout the Developer Access Program, and is eager to hear your feedback to further enhance the user experience.
InterSystems IRIS Cloud SQL, and the Cloud IntegratedML add-on, are foundational services in the InterSystems Cloud portfolio, and complement a number of successful earlier software-as-a-service and platform-as-a-service offerings for the healthcare market. They are building blocks for a composability approach to implement solutions that are easy to provision, scale, and operate in today’s fast-moving technology landscape.
Register for the Developer Access Program today, and start building your next masterpiece!
During the Developer Access Program, deployments can only be created in the AWS us-east-1 region. These terms and conditions apply.
Announcement
Anastasia Dyubaylo · Jun 17, 2020
Hey Developers,
We're pleased to invite you to join the next InterSystems IRIS 2020.1 Tech Talk: Using InterSystems Managed FHIR Service in the AWS Cloud on June 30 at 10:00 AM EDT!
In this InterSystems IRIS 2020.1 Tech Talk, we’ll focus on using InterSystems Managed FHIR Service in the AWS Cloud. We’ll start with an overview of FHIR, which stands for Fast Healthcare Interoperability Resources, and is a next generation standards framework for working with healthcare data.
You'll learn how to:
provision the InterSystems IRIS FHIR server in the cloud;
integrate your own data with the FHIR server;
use SMART on FHIR applications and enterprise identity, such as Active Directory, with the FHIR server.
We will discuss an API-first development approach using the InterSystems IRIS FHIR server. Plus, we’ll cover the scalability, availability, security, regulatory, and compliance requirements that using InterSystems FHIR as a managed service in the AWS Cloud can help you address.
Speakers:
🗣 @Patrick.Jamieson3621, Product Manager - Health Informatics Platform, InterSystems
🗣 @Anton.Umnikov, Senior Cloud Solution Architect, InterSystems
Date: Tuesday, June 30, 2020
Time: 10:00 AM EDT
➡️ JOIN THE TECH TALK!
Announcement
Cindy Olsen · May 5, 2023
Effective May 16, documentation for versions of InterSystems Caché® and InterSystems Ensemble® prior to 2017.1 will only be available in PDF format on the InterSystems documentation website. Local instances of these versions will continue to present content dynamically.
Announcement
Daniel Palevski · Nov 27, 2024
InterSystems announces General Availability of InterSystems IRIS, InterSystems IRIS for Health, and HealthShare Health Connect 2024.3
The 2024.3 release of InterSystems IRIS® data platform, InterSystems IRIS® for Health, and HealthShare® Health Connect is now Generally Available (GA).
Release Highlights
In this release, you can expect a host of exciting updates, including:
Much faster extension of database and WIJ files
Ability to resend messages from Visual Trace
Enhanced Rule Editor capabilities
Vector search enhancements
and more.
Please share your feedback through the Developer Community so we can build a better product together.
Documentation
Details on all the highlighted features are available through the links below:
InterSystems IRIS 2024.3 documentation, release notes, and the Upgrade Checklist.
InterSystems IRIS for Health 2024.3 documentation, release notes, and the Upgrade Checklist.
Health Connect 2024.3 documentation, release notes, and the Upgrade Checklist.
In addition, check out the upgrade information for this release.
Early Access Programs (EAPs)
There are many EAPs available now. Check out this page and register for the ones you are interested in.
How to get the software?
As usual, Continuous Delivery (CD) releases come with classic installation packages for all supported platforms, as well as container images in Docker container format.
Classic installation packages
Installation packages are available from the WRC's Continuous Delivery Releases page for InterSystems IRIS, InterSystems IRIS for Health, and Health Connect. Additionally, kits can also be found in the Evaluation Services website.
Availability and Package Information
This release comes with classic installation packages for all supported platforms, as well as container images in Docker container format. For a complete list, refer to the Supported Platforms document.
Installation packages and preview keys are available from the WRC's preview download site or through the evaluation services website.
The build number for this Continuous Delivery release is: 2024.3.0.217.0.
Container images are available from the InterSystems Container Registry. Containers are tagged as both "2024.3" and "latest-cd".
Announcement
Daniel Palevski · Mar 26
InterSystems Announces General Availability of InterSystems IRIS, InterSystems IRIS for Health, and HealthShare Health Connect 2025.1
The 2025.1 release of InterSystems IRIS® data platform, InterSystems IRIS® for Health™, and HealthShare® Health Connect is now Generally Available (GA). This is an Extended Maintenance (EM) release.
Release Highlights
In this exciting release, users can expect several new features and enhancements, including:
Advanced Vector Search Capabilities
A new disk-based Approximate Nearest Neighbor (ANN) index significantly accelerates vector search queries, yielding sub-second responses across millions of vectors. To learn more, try the following exercise: Vectorizing and Searching Text with InterSystems SQL. A short query sketch follows below.
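As a sketch of how this surfaces in SQL, the example below defines a vector column, builds an ANN index, and runs a similarity query. The table, column, and dimension are hypothetical, and the exact HNSW index DDL may differ by version, so verify it against the 2025.1 documentation:

```sql
-- Hypothetical table with a 384-dimensional embedding column
CREATE TABLE Demo.Documents (
    Title     VARCHAR(200),
    Embedding VECTOR(DOUBLE, 384)
);

-- Assumed ANN index DDL; check the release notes for the exact syntax
CREATE INDEX HNSWIndex ON TABLE Demo.Documents (Embedding)
    AS HNSW(Distance='Cosine');

-- Return the 10 titles most similar to a query embedding passed as a parameter
SELECT TOP 10 Title
FROM Demo.Documents
ORDER BY VECTOR_COSINE(Embedding, TO_VECTOR(?, DOUBLE, 384)) DESC;
```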
Enhanced Business Intelligence
Automatic dependency analysis in IRIS BI Cube building and synchronization, ensuring consistency and integrity across complex cube dependencies.
Improved SQL and Data Management
Introduction of standard SQL pagination syntax (LIMIT ... OFFSET ... and OFFSET ... FETCH ...); see the examples after this list.
New LOAD SQL command for simplified bulk import of DDL statements.
Enhanced ALTER TABLE commands to convert between row and columnar layouts seamlessly.
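For illustration, the two standard pagination forms, plus an assumed invocation of the new LOAD SQL command, look roughly like this (table, column, and file names are hypothetical):

```sql
-- LIMIT ... OFFSET ... pagination: skip 20 rows, return the next 10
SELECT Name, DOB
FROM Demo.Patients
ORDER BY Name
LIMIT 10 OFFSET 20;

-- Equivalent OFFSET ... FETCH ... form
SELECT Name, DOB
FROM Demo.Patients
ORDER BY Name
OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY;

-- Assumed form of the new LOAD SQL bulk DDL import; check the
-- documentation for the exact options
LOAD SQL FROM FILE '/tmp/schema.sql';
```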
Optimized Database Operations
Reduced journal record sizes for increased efficiency.
Faster database compaction, particularly for databases with lots of big string content.
Increased automation when adding new databases to a mirror.
New command-line utility for ECP management tasks.
Strengthened Security Compliance
Support for cryptographic libraries compliant with FIPS 140-3 standards.
Modernized Interoperability UI
Opt-in to a revamped Production Configuration and DTL Editor experience, featuring source control integration, VS Code compatibility, enhanced filtering, split-panel views, and more. Please see this Developer Community article for more information about how to opt-in and provide feedback.
Expanded Healthcare Capabilities
Efficient bulk FHIR ingestion and scheduling, including integrity checks and resource management.
Enhanced FHIR Bulk Access and improved FHIR Search Operations.
New Developer Experience Features
Embedded Python support within the DTL Editor, allowing Python-skilled developers to leverage the InterSystems platform more effectively. Watch the following video to learn more - Using Embedded Python in the BPL and DTL Editors.
Enhanced Observability with OpenTelemetry
Introduction of tracing capabilities in IRIS for detailed observability into web requests and application performance.
Please share your feedback through the Developer Community so we can build a better product together.
Documentation
Details on all the highlighted features are available through the links below:
InterSystems IRIS 2025.1 documentation and release notes.
InterSystems IRIS for Health 2025.1 documentation and release notes.
Health Connect 2025.1 documentation and release notes.
In addition, check out the upgrade impact checklist for an easily navigable overview of all changes you need to be aware of when upgrading to this release.
In particular, please note that InterSystems IRIS 2025.1 introduces a new journal file format version, which is incompatible with earlier releases and therefore imposes certain limitations on mixed-version mirror setups. See the corresponding documentation for more details.
Early Access Programs (EAPs)
There are many EAPs available now. Check out this page and register for the ones you are interested in.
Download the Software
As usual, Extended Maintenance (EM) releases come with classic installation packages for all supported platforms, as well as container images in Docker container format.
Classic Installation Packages
Installation packages are available from the WRC's InterSystems IRIS page for InterSystems IRIS and InterSystems IRIS for Health, and WRC’s HealthShare page for Health Connect. Kits can also be found in the Evaluation Services website.
Availability and Package Information
This release comes with classic installation packages for all supported platforms, as well as container images in Docker container format. For a complete list, refer to the Supported Platforms document.
The build number for this Extended Maintenance release is 2025.1.0.223.0.
Container images are available from the InterSystems Container Registry. Containers are tagged as both "2025.1" and "latest-em".
Discussion
Erik Svensson · Sep 18, 2020
Hello!
First of all, let me state that I am no senior InterSystems expert.
In my organization, we have a HealthShare Health Connect setup where each namespace has one code database and one data database, which are both actively mirrored. We have two nodes in the mirror.
We had a controlled failover last night to make sure that the backup node works as intended, which it didn't. It turned out that in several namespaces we had only deployed code onto the primary node, causing errors about missing classes after the failover. So it seems that each time you deploy productions, you have to deploy them manually to both instances (the primary and the failover). That makes me wonder:
What is actually mirrored when you mirror a code database? Obviously not new classes, but what about:
- changes to existing classes?
- settings on the production adapters?
- something else?
How do you guys go about deploying new code?
Are you utilizing some kind of automation tool to keep the mirrored nodes consistent regarding code and versions of code?
Are you just manually deploying to each node and have good routines doing it?
Or do we have some kind of faulty setup which makes this not work as intended?
I don't think our setup is faulty; I think we just missed this a bunch of times, which makes me want to abstract this so that you deploy to one place, and that place deploys the same code to both nodes.
An example: we have three environments (production, QA, and test). For both QA and production, we receive web service requests from two different networks, an internal one and an external one. For each network, we have a pair of web servers running httpd with Web Gateway. That makes four web server hosts each for the production and QA environments; in the test environment we have slimmed this down to a single pair, for a total of ten web servers. Maintained manually, this is bound to be time consuming and to create inconsistencies between hosts unless you are extremely thorough. So we use Ansible: I have made a playbook and a set of configs for each environment and network type, so each pair is treated exactly the same, and the playbook is always used to deploy changes and keep things consistent.
I would like to achieve something similar when deploying code to our HealthConnect mirrored instances.
How do you guys do it?

1) Check that ALL code DBs are part of your mirror. There is a fair chance that not all the code you use is in a single code DB; some of it may be mapped to other DBs. I'm not talking about implicitly mapped pieces like the system and %* utilities.
2) If you use code mapping, it is highly important that package mapping AND routine mapping go hand in hand.
3) Whatever the mirror synchronizes is based on the global journal, so all code DBs require journaling as well, since every routine and class, whether deployed or not, is stored in some global.
But my personal preference is not to apply mirroring to code DBs, mainly to control the point in time when a change/update happens. I'm a fan of the Red Fire Button and like to control the moment of essential changes.

Hi Robert,
Seeing as how I'm fighting the same issue (keeping mirror members synched, not code DBs though), what does the "Red Fire Button" refer to?
Thanks,
Dan
Definitely, something is wrong in the configuration. Code in InterSystems is in fact no different from any other data stored there. So you may have some wrong mappings, or you may be storing some of your code in %SYS.
I have a configuration with mirroring + ECP, and it works perfectly; I don't even care which of the nodes is primary and can switch at any time with no issues. And I have more than one code database, and more than 20 data databases. The mirror nodes run on 2018.1 while the ECP application servers run on 2012.2, with no issues.
If you have some doubts about your configuration, you can ask for help through the WRC, or we can help you with it: we can review your settings and say what actually happened and how to solve it.

Hi @Dan.Pahnke, the "Red Fire Button" is a name I have used over the years with my various teams for an action/decision that should not be taken by a single person but should follow (at least) the four-eyes principle.
Inspired by an endless number of Hollywood air force movies, the old but still incredible song from The Dubliners, and its best cover version.

The issue was that the general package mappings we had in the %ALL namespace were not mirrored, and they were not mapped on the backup node.
Question
Ashish Gupta · Feb 1, 2019
Hi All,
Can someone help me identify the security features and standards through which InterSystems Caché adheres to ISO 27001 and other security and privacy standards?
Also, can you tell me which algorithm is used for database encryption, and the default key strength?
This is required for a security audit.
Thanks in advance.
Ashish

Check the Caché Security Administration Guide, and also this article. You might be interested in this page:
https://docs.intersystems.com/ens20181/csp/docbook/DocBook.UI.Page.cls?KEY=GCAS_standards
Or potentially some of the documents on this one:
https://www.intersystems.com/gt/
Database encryption uses AES. You select the key size when creating the key; 128, 192, and 256 bits are all options.
If you have a specific question about standards not covered there, I would recommend contacting the WRC.
Announcement
Evgeny Shvarov · Jun 9, 2019
Hi Community!
I have very good news for developers who are using GitHub to host projects with InterSystems ObjectScript. GitHub introduced support for InterSystems ObjectScript this week!
How does it work?
Now all the .cls files in your repository are recognized as InterSystems ObjectScript and highlighted according to its language rules. See, for example, WebTerminal and Samples-Data.
All the credit goes to @Dmitry.Maslennikov, who is developing the VSCode ObjectScript extension; the code highlighting of VSCode and GitHub both use TextMate grammars. Dmitry submitted a PR to GitHub Linguist, which was reviewed by the GitHub community and has recently been approved.
So your repositories with .cls files will no longer show up strangely as Apex or TeX, but as InterSystems ObjectScript.
Thanks, Dmitry! Hope you'll provide the details on how ObjectScript is supported in GitHub.

Great news! Coloring works, but it still shows Apex, VB, TeX. A commit triggered a recalculation.

Great news, congratulations! P.S. Special thanks to Evgeny for pointing to exactly WebTerminal's analytics class :)

I have published a few more details. There you can find info on how to count and highlight the source of MAC, INT, and INC files as well.

Thank you @Dmitry.Maslennikov and @Evgeny.Shvarov, this is great news!

You are welcome, Nikita! This was pretty random, but I must admit that WebTerminal is a really popular app. And I'm looking forward to seeing the commit that makes the repo considered an InterSystems ObjectScript application ;)

Thanks a lot @Dmitry.Maslennikov!
Great news!Coloring works, but still APEX, VB, TeX. Commit triggered a recalculation. Great news, congratulations!P.S. Special thanks to Evgeny for pointing out to exactly WebTerminal's analytics class :) I have published a little bit more details. And there you can find info, how to count and highlight the source of MAC, INT, INC files as well. Thank you @Dmitry.Maslennikov and @Evgeny.Shvarov this is great news! You are welcome, Nikita! This was pretty random but must admit that WebTerminal is a really popular app. And looking forward to seeing the commit to make the repo considered as InterSystems ObjectScript application ;) Thanks a lot @Dmitry.Maslennikov !