#Indexing

0 Followers · 62 Posts

How to index data structures in databases.

Article Thomas Dyar · Jan 25 14m read

TL;DR: This article demonstrates how to run GraphRAG-style hybrid retrieval—combining vector similarity, graph traversal, and full-text search—entirely within InterSystems IRIS using the iris-vector-graph package. We use a fraud detection scenario to show how graph patterns reveal what vector search alone would miss.


Why Fraud Detection Needs Graphs

Every year, businesses and consumers lose billions to fraud.In 2024 alone, consumers reported $12.5 billion lost—a 25% increase year over year.What makes modern fraud so difficult to detect is that fraudsters rarely work alone.

0
0 17
Article Vachan C Rannore · Oct 21, 2025 3m read

Hello!!!

Data migration often sounds like a simple "move data from A to B task" until you actually do it. In reality, it is a complex process that blends planning, validation, testing, and technical precision.

Over several projects where I handled data migration into a HIS which runs on IRIS (TrakCare), I realized that success comes from a mix of discipline and automation.

Here are a few points which I want to highlight.

1. Start with a Defined Data Format.

Before you even open your first file, make sure everyone, especially data providers, clearly understands the exact data format you expect. Defining templates early avoids unnecessary bank-and-forth and rework later. 

While Excel or CSV formats are common, I personally feel using a tab-delimited text file (.txt) for data upload is best. It's lightweight, consistent, and avoids issues with commas inside text fields. 

PatID   DOB Gender  AdmDate
10001   2000-01-02  M   2025-10-01
10002   1998-01-05  F   2025-10-05
10005   1980-08-23  M   2025-10-15

Make sure that the date formats given in the file is correct and constant throughout the file because all these files are usually converted from an Excel file and an Basic excel user might make mistakes while giving you the date formats wrong. Wrong date formats can irritate you while converting into horolog.

12
2 227
Article Karthickraja S · Dec 18, 2025 2m read

The Power of Indexing in Database Tables

When working with databases, most developers understand the concept of an index and why it's used: to speed up data retrieval. But the real impact of indexing often becomes clear only when you compare scenarios with and without it.

Do you Know what Happens Without an Index?
Imagine a table with three columns: Name, Age, and MobileNumber.


Now, consider this query:

If the Age column does not have an index, the database engine will:

  • Check if the WHERE condition field has an index.
  • If not, it will scan the entire table (a full table scan).
  • For each row, it
3
2 112
Article Timothy Scott · Feb 28, 2025 7m read

High-Performance Message Searching in Health Connect

The Problem

Have you ever tried to do a search in Message Viewer on a busy interface and had the query time out? This can become quite a problem as the amount of data increases. For context, the instance of Health Connect I am working with does roughly 155 million Message Headers per day with 21 day message retention. To try and help with search performance, we extended the built-in SearchTable with commonly used fields in hopes that indexing these fields would result in faster query times. Despite this, we still couldn't get some of these queries to finish at all.

1
8 298
Question Christine Nyamu · Oct 22, 2025

Hello. I need to transform a message

FROM:

MSH|^~\&|
SCH||61490||
PID|1||
RGS|1||1
AIS|1||
AIS|2||
AIS|3||
AIL|1||
AIP|1||

TO:

MSH|^~\&|
SCH||61490||
PID|1||
RGS|1||1
AIS|1||
AIL|1||
AIP|1||
RGS|1||1
AIS|2||
AIL|1||
AIP|1||
RGS|1||1
AIS|3||
AIL|1||
AIP|1||

The RGS, AIS, AIL and AIP are all under the RGS group. The one RGS segment that comes in will be copied across the group. If 3 AIS segments come in then I need 3 RGS groups, if 2 I need 2 RGS groups etc.

In my DTL (screenshot below) I have currently hardcoded the RGS index (1,2,3) but this will not be sufficient incase 4 AIS segments are sent in.

5
0 108
Article Harshitha · Oct 22, 2025 2m read

Hello community,

I wanted to share my experience about working on Large Data projects. Over the years, I have had the opportunity to handle massive patient data, payor data and transactional logs while working in an hospital industry. I have had the chance to build huge reports which had to be written using advanced logics fetching data across multiple tables whose indexing was not helping me write efficient code.

Here is what I have learned about managing large data efficiently.

Choosing the right data access method.

As we all here in the community are aware of, IRIS provides multiple ways to access data. Choosing the right method, depends on the requirement.

  • Direct Global Access: Fastest for bulk read/write operations. For example, if i have to traverse through indexes and fetch patient data, I can loop through the globals to process millions of records. This will save a lot of time.
Set ToDate=+HSet FromDate=+$H-1ForSet FromDate=$O(^PatientD("Date",FromDate)) Quit:FromDate>ToDate  Do
. Set PatId=""ForSet PatId=$Order(^PatientD("Date",FromDate,PatID)) Quit:PatId=""Do
. . Write$Get(^PatientD("Date",FromDate,PatID)),!
  • Using SQL: Useful for reporting or analytical requirements, though slower for huge data sets.
6
1 179
Article Benjamin De Boe · Jun 19, 2025 10m read

This article describes a significant enhancement of how InterSystems IRIS deals with table statistics, a crucial element for IRIS SQL processing, in the 2025.2 release. We'll start with a brief refresher on what table statistics are, how they are used, and why we needed this enhancement. Then, we'll dive into the details of the new infrastructure for collecting and saving table statistics, after which we'll zoom in onto what the change means in practice for your applications. We'll end with a few additional notes on patterns enabled by the new model, and look forward to the follow-on phases of this initial delivery.

6
6 402
Article Guillaume Rongier · Jul 28, 2025 3m read

img

This will be a short article about Python dunder methods, also known as magic methods.

What are Dunder Methods?

Dunder methods are special methods in Python that start and end with double underscores (__). They allow you to define the behavior of your objects for built-in operations, such as addition, subtraction, string representation, and more.

Some common dunder methods include:

  • __init__(self, ...): Called when an object is created.
    • Like our %OnNew method in ObjectScript.
  • __str__(self): Called by the str() built-in function and print to represent the object as a string.
  • __repr__(self):
0
2 198
Question Jean Millette · Jun 3, 2025

Hello,

I have a class with a "Unique" index (pxfactidIndex) on a %Numeric property (pxfactid) (partially-edit code snippet below):

Property pxfactid As%Library.Numeric(MAXVAL = 9223372036854775807, MINVAL = -9223372036854775808, SCALE = 0) [ SqlColumnNumber = 7 ];
Index pxfactidIndex On pxfactid [ Unique ];
Storage Default
{
<Data name="FactDefaultData">
<Value name="1">
<Value>pysubjectid</Value>
</Value>
...
<Value name="6">
<Value>pxfactid</Value>
</Value>
...
</Data>
<DataLocation>^CRMBI.FactD</DataLocation>
<DefaultData>FactDefaultData</DefaultData>
<ExtentLocation>^CRMBI.F
8
0 162
Question Krishnamuthu Venkatachalam · Mar 10, 2025

Hi all,

I'm working on a requirement to loop through all encounter streamlets(SDA) to identify specific encounters based on an encounter extension property for a patient fetch request. However, this current process is time-consuming, and we need to create indexes for that property to quickly retrieve the expected results without going through all the encounter streamlets of a patient.

I would appreciate help on how to achieve this, as I couldn't find any documentation explaining how to create indexes on a SDA element.

Thanks in advance.

1
0 137
Article Joe Fu · Mar 7, 2025 2m read

We recently changed the 'UserID" property in a "User" class from type of %String to be %Library.Username. This is for better consistency across our codebase regarding MAXLEN limit.

%Library.Username is a system wrapper datatype which extends %String and has a MAXLEN of 160. This change should have minimal/no impact on code behavior. However, we found that some SQL query cannot return expected rows after the change. Query will return empty values even if the entry is in the table.

After investigation, we found that re-building index could solve the issue, which means the data type change caused

3
0 237
Discussion Nick Petrocelli · Jun 21, 2024

Hello everyone, 

My team is currently developing guidance and best practices for the generation, storage, and deployment of TUNE TABLE statistics across development and production environments. With that in mind, we want to get an idea of what methods teams in the field have developed for handling this data. In particular, we’d like to know the following: 

  1. How often do you use TUNE TABLE in your development vs. production environments? 
  2. Do you utilize the $SYSTEM.SQL.Stats.Table package to generate and export TUNE TABLE statistics as files? If so: 
    1. Do you store these in source control? 
    2. W
3
1 295
Question Hannah Sullivan · Dec 30, 2024

I have a unique index on two properties in a class, and I want to open the instance by index key with one of the two properties is set to the empty string. Is this possible to perform?

Take the example class here

Class TestIndex
{

Property Mandatory as TestClass [Required];Property Alternate as TestClass;Property AltOrEmpty As%String [ Calculated, Required, SqlComputeCode = {Set {*} = $Case({Alternate},"":$c(0),:{Alternate})}, SqlComputed ];
Index AlternateMandatory on (Mandatory, AltOrEmpty) [ Unique ];
}

And example data as

Mandatory Alternate AltOrNUL
2    
2 3
2
0 152
Article Robert Cemper · Jul 8, 2023 2m read

A recent question from @Vivian Lee reminded me of a rather ancient example.
It was the time when DeepSee's first version was released.
We got Bitmap Index.
And we got BitSlice Index: mapping a numeric value by its binary parts.
So my idea: Why not indexing strings by their characters?
The result of this idea was presented first in June 2008. 
IKnow wasn't publicly available at that time.

1
1 249
InterSystems Official Thomas Dyar · Oct 3, 2024

We've recently made available a new version of InterSystems IRIS in the Vector Search Early Access Program, featuring a new Approximate Nearest Neighbor index based upon the Hierarchical Navigable Small World (HNSW) indexing algorithm. This addition allows for highly efficient, approximate nearest-neighbor searches over large vector datasets, dramatically improving query performance and scalability.

The HNSW algorithm is designed to optimize vector search for high-dimensional data by building a graph-based structure, making it faster to find approximate neighbors in large collections of

1
1 355
Article Timothy Leavitt · Feb 21, 2024 9m read

Suppose you have an application that allows users to write posts and comment on them. (Wait... that sounds familiar...)

For a given user, you want to be able to list all of the published posts with which that user has interacted - that is, either authored or commented on. How do you make this as fast as possible?

Here's what our %Persistent class definitions might look like as a starting point (storage definitions are important, but omitted for brevity):

Class DC.Demo.Post Extends%Persistent
{

Property Title As%String(MAXLEN = 255) [ Required ];Property Body As%String(MAXLEN = "") [
3
5 558
Question Ashok Kumar T · Jul 20, 2024

Hello Community,

As per the Build index documentation "If you use BUILD INDEX on a live system, the index is temporarily labeled as notselectable, meaning that queries cannot use the index while it is being built. Note that this will impact the performance of queries that use the index." Is this  hiding/not selectable is only applicable for BUILD INDEX or it supports class level %BuildIndices as well. as far as my analysis both syntax setting this setting SetMapSelectability

Thanks!

3
0 208
Announcement Shane Nowack · Apr 22, 2024

Hello IRIS Community,

InterSystems Certification is developing a certification exam for InterSystems IRIS SQL specialists, and if you match the exam candidate description given below, we would like you to beta test the exam. The exam will be available for beta testing on June 9 - 12, 2024 at InterSystems Global Summit 2024, but only for Summit registrants (visit this page to learn more about Certification at GS24). Beta testing will open for all other interested beta testers on June 24, 2024. However, interested beta testers should sign up now by emailing certification@intersystems.com (please let us know if you will be beta testing at Global Summit or in our online proctored environment). The beta testing must be completed by August 2, 2024.

5
7 1493
Article Mihoko Iijima · Aug 31, 2023 1m read

InterSystems FAQ rubric

By specifying the start and end values ​​of the IDs for which you want to rebuild indexes in the arguments of the %BuildIndices() method provided in the persistent class (=table) definition, you can rebuild only the indexes within that range.

For example, to rebuild the NameIDX index and ZipCode index in the Sample.Person class only for ID=10 to 20, execute the following code (the ID range is specified in the 5th and 6th arguments). 

 set status = ##class(Sample.Person).%BuildIndices($LB("NameIDX","ZipCode"),1,,1,10,20

$LB() is the $ListBuild() function. The

0
0 643
Article Mihoko Iijima · Jun 29, 2023 3m read

InterSystems FAQ rubric

For volatile tables (tables with many INSERTs and DELETEs), storage for bitmap indexes can become inefficient over time.

For example, suppose that there are thousands of data with the following definition, and the operation of bulk deletion with TRUNCATE TABLE after being retained for a certain period of time is repeatedly performed.

Class MyWork.MonthData Extends (%Persistent, %Populate)
{
/// Level of satisfaction
Property Satisfaction As %String(VALUELIST = ",満足,やや満足,やや不満,不満,");
/// Age
Property Age As %Integer(MAXVAL = 70, MINVAL = 20);
Index AgeIdx On Age [
0
0 379
Article Timothy Leavitt · Jun 28, 2022 2m read

An interesting pattern around unique indices came up recently (in internal discussion re: isc.rest) and I'd like to highlight it for the community.

As a motivating use case: suppose you have a class representing a tree, where each node also has a name, and we want nodes to be unique by name and parent node. We want each root node to have a unique name too. A natural implementation would be:

Class DC.Demo.Node Extends%Persistent
{

Property Parent As DC.Demo.Node;Property Name As%String [ Required ];
Index ParentAndName On (Parent, Name) [ Unique ];
Storage Default
{
<Data name="NodeDe
8
0 1258
Question Lukas Dolezal · Apr 10, 2022

I want to store data in an index global without defining an index in an inherited class.
Example:

Class Aclass [ Abstract ]
{
Index TXSBI On TextSearch(KEYS);
Index TXSSI On TextSimilarity(KEYS) [ Data = TextSimilarity(ELEMENTS) ];
Property TextSearch As %Text(LANGUAGECLASS = "%ZText.Czech", MAXLEN = 1000, XMLPROJECTION = "NONE");
Property TextSimilarity As %Text(LANGUAGECLASS = "%ZText.CzechSim", MAXLEN = 1000, SIMILARITYINDEX = "TXSSI", XMLPROJECTION = "NONE");
...
//other code
Class AAclass Extends (%Persistent, ClassType.SuperClass, Aclass, Bclass...)
{

}

Is there a way to inherit

5
0 662
Question José Pereira · Apr 7, 2022

Hi!

I'd like to know if there are any issues if an index is inserted into a table without running the %BuildIndices() method.

It's important to note that data inserted before the index is not important for retrieval, so it's not a problem data inserted before the index don't show up in queries.

The reason why I'm asking this is that I'd like to avoid index reconstruction on big tables which I need to inser such index.

I'm using Cache 2018.1.

Thanks,

José

8
0 498
Article José Pereira · Feb 2, 2021 12m read

Image search like Google's is a nice feature that wonder me - as almost anything related to image processing.

A few months ago, InterSystems released a preview for Python Embedded. As Python has a lot of libs for deal with image processing, I decided to start my own attemptive to play with a sort of image search - a much more modest version in deed :-)


---

A tast of theory 🤓

In order to do an image search system, fist it's necessary select a set of features to be extracted from images - these features are also called descriptors.

0
0 442
Article Allyson Gerace · Feb 6, 2019 8m read

See Part 1 here.

Part 2: Index Handling

Now you have a good idea of what kind of indices you need for your class and how to define them. Next, how do you handle them?

Query Plan

(REMEMBER: Like any modifications to a class, adding indices in a live system has its risks – if users are accessing or updating data while an index is populated, they may encounter empty or incorrect query results, or even corrupt the indices that are being built.

1
0 1922
Article Allyson Gerace · Feb 6, 2019 13m read

This is the first in a pair of articles on SQL indices.

Part 1 - Know your indices

What is an index, anyway?

Picture the last time you went to a library. Typically they have books sorted by subject matter (and then author and title), and each shelf has an end-plate with a code describing the subject of its books. If you wanted to collect books of a certain subject, instead of walking across every aisle and reading the inside cover of every book, you could head straight for the bookshelf labelled with your desired subject matter and choose your books.

A SQL index has the same general function:

2
6 2278