Article

Kyle Baxter · Sep 9, 2016 5m read

Free Text Search: The Way To Search Your Text Fields That SQL Developers Are Hiding From You!*

Have some free text fields in your application that you wish you could search efficiently? Tried using some methods before but found out that they just cannot match the performance needs of your customers? Do I have one weird trick that will solve all your problems? Don’t you already know!? All I do is bring great solutions to your performance pitfalls!

As usual, if you want the TL;DR (too long; didn’t read) version, skip to the end. Just know you are hurting my feelings.

#Best Practices #iFind #Indexing #Object Data Model #ObjectScript #SQL #Caché

Article

Alexander Koblov · Jan 29, 2016 9m read

Creating a Custom Index Type in Caché

The object and relational data models of the Caché database support three types of indexes: standard, bitmap, and bitslice. In addition to these three native types, since version 2013.1 developers can declare their own custom index types and use them in any class. For example, iFind text indexes use that mechanism.

#Best Practices #Databases #Indexing #Object Data Model #SQL #Caché

Article

Allyson Gerace · Feb 6, 2019 13m read

Know Your Indices

This is the first in a pair of articles on SQL indices.

Part 1 - Know your indices

#Best Practices #Indexing #Performance #SQL #Caché #InterSystems IRIS

Article

Vitaliy Serdtsev · Jun 29, 2017 6m read

SQL index for array property elements

Sometimes it is very handy (especially for the EAV model) to use array properties in a class and to be able to search quickly by their elements, both key and value.

Let’s take a look at a simple example:

#Indexing #ObjectScript #SQL #Caché

Article

Michael Braam · Feb 20, 2017 14m read

Making encrypted datafields SQL-searchable

Overview

Encryption of sensitive data, such as patient names, SSNs, addresses or credit card numbers, is becoming more and more important for applications.

Caché supports two flavors of encryption: block-level database encryption and data-element encryption. Block-level database encryption protects an entire database. Encryption and decryption happen when a block is written to or read from the database, with very little impact on performance.

With data-element encryption, only certain data fields are encrypted, namely those that contain sensitive data such as patient data or credit card numbers. Data-element encryption is also useful if periodic re-encryption is required. With data-element encryption, it is the responsibility of the application to encrypt and decrypt the data.

Both encryption methods leverage the managed key encryption infrastructure of Caché.

The following article describes a sample use-case where data-element encryption is used to encrypt person data.

But what if you have hundreds of thousands of records with an encrypted datafield and you have the need to search that field? Decryption of the field-values prior to the search is not an option. What about indices?

This article describes a possible solution and develops, step by step, a small example of how you can use SQL and indices to search encrypted fields.

#Encryption #Indexing #Object Data Model #SQL #Caché

Article

Allyson Gerace · Feb 6, 2019 8m read

Index Handling

See Part 1 here.

Part 2: Index Handling

#Best Practices #Indexing #Performance #SQL #Caché #InterSystems IRIS

Article

Sergey Kamenev · Jul 7, 2017 7m read

Globals - Magic swords for storing data. Sparse arrays. Part 3.

In the previous parts (1, 2) we talked about globals as trees. In this article, we will look at them as sparse arrays.

A sparse array is a type of array in which most elements have the same value.

In practice, you will often see sparse arrays so huge that there is no point in occupying memory with identical elements. Therefore, it makes sense to organize sparse arrays in such a way that memory is not wasted on storing duplicate values.
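
As a rough illustration of that idea, here is a minimal Python sketch in which only the elements that differ from the default value are stored. The `SparseArray` class and its `stored` method are hypothetical names invented for this example, and the dictionary merely stands in for a global; this is not how Caché or InterSystems IRIS stores data internally.

```python
class SparseArray:
    """Illustrative sparse array: only non-default elements take up memory."""

    def __init__(self, default=0):
        self.default = default
        self._cells = {}  # index -> value, for non-default values only

    def __setitem__(self, index, value):
        if value == self.default:
            # Writing the default value frees the slot instead of storing it.
            self._cells.pop(index, None)
        else:
            self._cells[index] = value

    def __getitem__(self, index):
        # Any index that was never set reads back as the default value.
        return self._cells.get(index, self.default)

    def stored(self):
        """Number of elements that actually occupy memory."""
        return len(self._cells)


grid = SparseArray(default=0)
grid[1_000_000_000] = 7   # a huge index costs one dictionary entry
grid[5] = 0               # default value: nothing is stored
print(grid[1_000_000_000], grid[3], grid.stored())
```

The array can be addressed with arbitrarily large indices, yet memory use grows only with the number of non-default elements.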

In some programming languages, sparse arrays are part of the language - for example, in J and MATLAB. In other languages, there are special libraries that let you use them. For C++, those would be Eigen and the like.

Globals are good candidates for implementing sparse arrays for the following reasons:

#Beginner #Data Model #Globals #Indexing #Key Value #Performance #Relational Tables #Caché #InterSystems IRIS

Article

Timothy Leavitt · Jun 28, 2022 2m read

Unique indices and null values in InterSystems IRIS

An interesting pattern around unique indices came up recently (in internal discussion re: isc.rest) and I'd like to highlight it for the community.

As a motivating use case: suppose you have a class representing a tree, where each node also has a name, and we want nodes to be unique by name and parent node. We want each root node to have a unique name too. A natural implementation would be:

#Indexing #SQL #InterSystems IRIS

Article

Benjamin De Boe · Jun 28, 2016 7m read

iKnow demo apps (part 5) - iFind search portal

Earlier in this series, we've presented four different demo applications for iKnow, illustrating how its unique bottom-up approach allows users to explore the concepts and context of their unstructured data and then leverage these insights to implement real-world use cases. We started small and simple with core exploration through the Knowledge Portal, then organized our records according to content with the Set Analysis Demo, organized our domain knowledge using the Dictionary Builder Demo and finally built complex rules to extract nontrivial patterns from text with the Rules Builder Demo.

This time, we'll dive into a different area of the iKnow feature set: iFind. Where iKnow's core APIs are all about exploration and leveraging those results programmatically in applications and analytics, iFind is focused specifically on search scenarios in a pure SQL context. We'll be presenting a simple search portal implemented in Zen that showcases iFind's main features.

#iFind #Indexing #SQL #InterSystems Natural Language Processing (NLP, iKnow)

Article

Vitaliy Serdtsev · Jul 7, 2017 19m read

Indexing of non-atomic attributes

Quotes (1NF/2NF/3NF)^ru:

Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
The same value can be atomic or non-atomic depending on the purpose of this value. For example, “4286” can be

atomic, if it denotes “a credit card’s PIN code” (if it’s broken down or reshuffled, it is of no use any longer)

non-atomic, if it’s just a “sequence of numbers” (the value still makes sense if broken down into several parts or reshuffled)

This article explores the standard methods of increasing the performance of SQL queries involving the following types of fields: string, date, simple list (in the $LB format), "list of <...>" and "array of <...>".

#Indexing #Object Data Model #ObjectScript #Performance #SQL #Caché

Article

Jean Millette · Aug 22, 2019 3m read

A Case for Thawing Frozen Query Plans After Upgrade

Our team is reworking an application to use REST services that use the same database as our current ZEN application. One of the new REST endpoints uses a query that ran very slowly when first implemented. After some analysis, we found that an index on one of the fields in the table greatly improved performance (a query that took 35 seconds was now taking a fraction of a second).

#Indexing #Performance #SQL #Caché #InterSystems IRIS

Article

Timothy Leavitt · Feb 21, 2024 9m read

Functional indices for lightning-fast queries on many-to-many relationship tables

Suppose you have an application that allows users to write posts and comment on them. (Wait... that sounds familiar...)

For a given user, you want to be able to list all of the published posts with which that user has interacted - that is, either authored or commented on. How do you make this as fast as possible?

Here's what our %Persistent class definitions might look like as a starting point (storage definitions are important, but omitted for brevity):

#Best Practices #Indexing #ObjectScript #SQL #InterSystems IRIS

Article

Mihoko Iijima · Aug 31, 2023 1m read

How to rebuild index by ID

InterSystems FAQ rubric

By passing the start and end ID values as arguments to the %BuildIndices() method that every persistent class (= table) definition provides, you can rebuild indexes for only the rows within that ID range.

#Indexing #Object Data Model #Relational Tables #SQL #Tips & Tricks #Caché #InterSystems IRIS #InterSystems IRIS for Health

Article

José Pereira · Feb 2, 2021 12m read

A custom SQL index with Python features

Image search like Google's is a nice feature that fascinates me, as does almost anything related to image processing.

A few months ago, InterSystems released a preview of Embedded Python. As Python has a lot of libraries for image processing, I decided to make my own attempt at a sort of image search - a much more modest version indeed :-)

#Embedded Python #Indexing #Multi-model #SQL #InterSystems IRIS

Article

Mihoko Iijima · Jun 29, 2023 3m read

How to compress (maintain) bitmap indexes for volatile tables

InterSystems FAQ rubric

For volatile tables (tables with many INSERTs and DELETEs), storage for bitmap indexes can become inefficient over time.

For example, suppose a table with the following definition holds thousands of records, and that records are repeatedly bulk-deleted with TRUNCATE TABLE after being retained for a certain period of time.

#Indexing #SQL #Tips & Tricks #Caché #Ensemble #InterSystems IRIS #InterSystems IRIS for Health

Article

Benjamin De Boe · Jun 19 10m read

Towards Smarter Table Statistics

This article describes a significant enhancement of how InterSystems IRIS deals with table statistics, a crucial element for IRIS SQL processing, in the 2025.2 release. We'll start with a brief refresher on what table statistics are, how they are used, and why we needed this enhancement. Then, we'll dive into the details of the new infrastructure for collecting and saving table statistics, after which we'll zoom in on what the change means in practice for your applications. We'll end with a few additional notes on patterns enabled by the new model, and look forward to the follow-on phases of this initial delivery.

#Indexing #Performance #Relational Tables #SQL #InterSystems IRIS

Article

Robert Cemper · Jul 8, 2023 2m read

Character-Slice Index

A recent question from @Vivian Lee reminded me of a rather ancient example.
It was the time when DeepSee's first version was released.
We got Bitmap Index.
And we got BitSlice Index: mapping a numeric value by its binary parts.
So my idea: Why not index strings by their characters?
The result of this idea was presented first in June 2008.
iKnow wasn't publicly available at that time.
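
To make the two ideas above concrete, here is a small Python sketch. It is only an illustration under stated assumptions, not the implementation from the article or from InterSystems IRIS: plain Python sets of row IDs stand in for bitmaps, and all function names are hypothetical. The first part slices non-negative integers by their binary digits; the second applies the same "one slice per part" idea to the characters of a string.

```python
from collections import defaultdict


def build_bitslice(values):
    """Map each bit position to the set of row IDs whose value has that bit set.

    `values` is a dict of row_id -> non-negative int (an assumption of this sketch).
    """
    slices = defaultdict(set)
    for row_id, value in values.items():
        bit = 0
        while value:
            if value & 1:
                slices[bit].add(row_id)
            value >>= 1
            bit += 1
    return dict(slices)


def bitslice_sum(slices, row_ids):
    """Sum the indexed values of `row_ids` using only the slices."""
    return sum((1 << bit) * len(members & row_ids)
               for bit, members in slices.items())


def build_char_index(strings):
    """Map each character to the set of row IDs whose string contains it."""
    index = defaultdict(set)
    for row_id, text in strings.items():
        for char in set(text):
            index[char].add(row_id)
    return dict(index)


def search_substring(index, strings, term):
    """Use the character slices to narrow candidates, then verify each one."""
    if not term:
        return set(strings)
    candidates = None
    for char in set(term):
        rows = index.get(char, set())
        candidates = rows if candidates is None else candidates & rows
    # Containing every character is necessary but not sufficient, so re-check.
    return {row_id for row_id in candidates if term in strings[row_id]}


amounts = {1: 5, 2: 3, 3: 8}
print(bitslice_sum(build_bitslice(amounts), {1, 2}))

names = {1: "index", 2: "bitmap", 3: "bitslice"}
print(search_substring(build_char_index(names), names, "bit"))
```

In this sketch the character slices only prune the candidate set; the final substring check is still done against the stored strings.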

#Indexing #InterSystems IRIS

Article

Joe Fu · Mar 7 2m read

SQL Query Stopped Working After Changing %String Property to Wrapper Types

We recently changed the "UserID" property in a "User" class from %String to %Library.Username, for better consistency across our codebase regarding the MAXLEN limit.

%Library.Username is a system wrapper datatype that extends %String and has a MAXLEN of 160. This change should have had minimal or no impact on code behavior. However, we found that some SQL queries no longer returned the expected rows after the change. Queries returned empty results even when the entry was in the table.

#Indexing #Namespace #SQL #HealthShare #InterSystems IRIS

Article

Timothy Scott · Feb 28 7m read

High-Performance Message Searching in Health Connect

The Problem

Have you ever tried to do a search in Message Viewer on a busy interface and had the query time out? This can become quite a problem as the amount of data increases. For context, the instance of Health Connect I am working with handles roughly 155 million Message Headers per day with 21-day message retention. To try and help with search performance, we extended the built-in SearchTable with commonly used fields in hopes that indexing these fields would result in faster query times. Despite this, we still couldn't get some of these queries to finish at all.

#Big Data #HL7 #Indexing #Interoperability #Message Search #Performance #SQL #Tips & Tricks #Health Connect
