Have some free text fields in your application that you wish you could search efficiently? Tried using some methods before but found out that they just cannot match the performance needs of your customers? Do I have one weird trick that will solve all your problems? Don’t you already know!? All I do is bring great solutions to your performance pitfalls!

As usual, if you want the TL;DR (too long; didn’t read) version, skip to the end. Just know you are hurting my feelings.

22 11
2 2.6K

The object and relational data models of the Caché database support three types of indexes, which are standard, bitmap, and bitslice. In addition to these three native types, developers can declare their own custom types of indexes and use them in any classes since version 2013.1. For example, iFind text indexes use that mechanism.

12 1
1 2.1K

Overview

Encryption of sensitive data becomes more and more important for applications. For example patient names, SSN, address-data or credit card-numbers etc..

Cache supports different flavors of encryption. Block-level database encryption and data-element encryption. The block-level database encryption protects an entire database. The decryption/encryption is done when a block is written/read to or from the database and has very little impact on the performance.

With data-element encryption only certain data-fields are encrypted. Fields that contain sensitive data like patient data or credit-card numbers. Data-element encryption is also useful if a re-encryption is required periodically. With data-element encryption it is the responsibility of the application to encrypt/decrypt the data.

Both encryption methods leverage the managed key encryption infrastructure of Caché.

The following article describes a sample use-case where data-element encryption is used to encrypt person data.

But what if you have hundreds of thousands of records with an encrypted datafield and you have the need to search that field? Decryption of the field-values prior to the search is not an option. What about indices?

This article describes a possible solution and develops step-by-step a small example how you can use SQL and indices to search encrypted fields.

5 9
1 1.7K

In the previous parts (1, 2) we talked about globals as trees. In this article, we will look at them as sparse arrays.

A sparse array - is a type of array where most values assume an identical value.

In practice, you will often see sparse arrays so huge that there is no point in occupying memory with identical elements. Therefore, it makes sense to organize sparse arrays in such a way that memory is not wasted on storing duplicate values.

In some programming languages, sparse arrays are part of the language - for example, in J, MATLAB. In other languages, there are special libraries that let you use them. For C++, those would be Eigen and the like.

Globals are good candidates for implementing sparse arrays for the following reasons:

8 3
1 1.4K

Hi, folks!

Suppose you have a Caché class with %String property which contains relatively large text (from 10 to 2000 symbols).

The class:

Class Test.Duplicates Extends %Persistent 

{

Property Text As %String (MAXLEN = 2000);

}

And you have thousands of entries.

What are the best options to find entries which are duplicates on this property?

0 26
1 1.3K

Hello IRIS Community,

InterSystems Certification is developing a certification exam for InterSystems IRIS SQL specialists, and if you match the exam candidate description given below, we would like you to beta test the exam. The exam will be available for beta testing on June 9 - 12, 2024 at InterSystems Global Summit 2024, but only for Summit registrants (visit this page to learn more about Certification at GS24). Beta testing will open for all other interested beta testers on June 24, 2024. However, interested beta testers should sign up now by emailing certification@intersystems.com (please let us know if you will be beta testing at Global Summit or in our online proctored environment). The beta testing must be completed by August 2, 2024.

10 5
8 1.2K

Earlier in this series, we've presented four different demo applications for iKnow, illustrating how its unique bottom-up approach allows users to explore the concepts and context of their unstructured data and then leverage these insights to implement real-world use cases. We started small and simple with core exploration through the Knowledge Portal, then organized our records according to content with the Set Analysis Demo, organized our domain knowledge using the Dictionary Builder Demo and finally build complex rules to extract nontrivial patterns from text with the Rules Builder Demo.

This time, we'll dive into a different area of the iKnow feature set: iFind. Where iKnow's core APIs are all about exploration and leveraging those results programmatically in applications and analytics, iFind is focused specifically on search scenarios in a pure SQL context. We'll be presenting a simple search portal implemented in Zen that showcases iFind's main features.

8 1
1 1.2K

An interesting pattern around unique indices came up recently (in internal discussion re: isc.rest) and I'd like to highlight it for the community.

As a motivating use case: suppose you have a class representing a tree, where each node also has a name, and we want nodes to be unique by name and parent node. We want each root node to have a unique name too. A natural implementation would be:

7 8
0 1K

I'm using Cache SQL and want the ability to choose a specific index.

I've boiled the problem down to one table and simplified the query down to

SELECT *
FROM Registration.PatResp
WHERE SchedApptNum=8450022

SchedApptNum is indexed, but instead of using that column, "Show Plan" indicates that it's looping through the entire Registration.PatResp table on Id (the primary key for the table).

I've done a tune-table with no change.

0 6
0 1K
Question
· Mar 5, 2019
PrimaryKey vs Idkey

Just wondering an Insight in the difference between these two indexes

IdKey / PrimaryKey
=================

Property Identifier As %Integer

Index Index1 on Identifier [Idkey]

Index Index2 on Identifier [PrimaryKey]

What's the difference?

1. If I don't have Index1 and only have Index2, then cache does still make its own id.
So how and why do I ever use the PrimaryKey. In Joins ??

1 4
0 1K
Article
· Jul 7, 2017 19m read
Indexing of non-atomic attributes

Quotes (1NF/2NF/3NF)ru:

Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
The same value can be atomic or non-atomic depending on the purpose of this value. For example, “4286” can be
  • atomic, if its denotes “a credit card’s PIN code” (if it’s broken down or reshuffled, it is of no use any longer)
  • non-atomic, if it’s just a “sequence of numbers” (the value still makes sense if broken down into several parts or reshuffled)

This article explores the standard methods of increasing the performance of SQL queries involving the following types of fields: string, date, simple list (in the $LB format), "list of <...>" and "array of <...>".

7 0
0 1K

Hi guys!


Unique, PrimaryKey and IDKey?
In what contexts does it apply?

IDKey sets the registry key access to the store.
PrimaryKey, Unique, and IDKey define the uniqueness in the records, but what is correct?

I use everyone? What is the context of each?

1 4
0 790

Hi,

I have a class with around 400k lines and 60 columns. Class storage is Cache SQL storage (Mapped from a global).

I want to create multiple indices on certain fields.

I am familiar with two approaches:

1. Create a new map (Index type) on a pointer global.

2. Create a bitmap index

Which approach is more recommended to be used in the case I described? If there are any other approaches, I will be happy to hear.

Thanks :)

0 11
0 740
Question
· Nov 23, 2017
Indexing null value

Dear community!

I have problem with index NULL value. Unique index doesn't work for this case. If I use insert and one of parameter is "NULL". Message of constraint doesn't appear and row is inserted into table successfully. How Can I use index with NULL?

1 3
0 679
Question
· Jul 16, 2017
How to index a class

Hi

I have two persistent classes defined. Lets call it Parent and Child.

Child class is one of the property of Parent Class.

I would like to define a index on Child class.

So what is the default behaviour I defined a index on a non simple data type member?

Any possibility that I could customized the behaviour ? For example. Child class has three properties.

Could I configure the index to index any combinations of these three properties?

Thanks for your help.

0 1
0 616

Hello Fellow Cache Developers:

Has anyone ever created an index on values of a list property? If so, would you be willing to share an example?

Also, feel free to offer input and suggestions regarding use of indexes on List values.

Here is my database scenario:

Parent Class:

PropertyA - %String

PropertyB - %Integer

Child Class:

PropertyC - %Integer

PropertyD - list of %Integer

Data illustration:

0 2
0 582

Hi!

Consider I have a class Package.Data with Property UniqueStringValue as %String.

I introduced the Index for this property:

 Index ValueIndex on UniqueStringValue [Unique];

It works well. But if I try to check if there is an object with the certain value in code like this:

if ##class(Package.Data).ValueIndexExists(value)  

this expression fails, if value="value", even if there is an instance with instance.UniqueStingValue="Value"

How can I set the index to prevent saving case sensitive values in this class?

0 6
0 563

Our team is reworking an application to use REST services that use the same database as our current ZEN application. One of the new REST endpoints uses a query that ran very slowly when first implemented. After some analysis, we found that an index on one of the fields in the table greatly improved performance (a query that took 35 seconds was now taking a fraction of a second).

3 4
0 452
Question
· Dec 26, 2019
Performant index on date field

Is there a way to get a good performing index on a date field? I have tried various date property indexes and the query plan is always in a pretty high range. Below are query plan result values I have observed:

StartDate > '2019-12-01' --cost = 699168
StartDate = '2019-12-21' --cost 70666
StartDate between '2019-12-21' and '2019-21-28' --cost = 492058

The query plans above were for type %TimeStamp.

0 7
0 507
Question
· Apr 15, 2018
Indexes in Cache Objects

Hi-

I have the following objects

Class A

Property P1 As B

Property P2 As %String

Property P3 As %String

Class B

Property P1 As %String

Can I create an index in Class A based on P1.P1. Basically I want an index of class A by property P1 in class B

I tried creating the following but got a compile error

Index I1 On P1.P1

Thanks

0 2
0 503

I have a class which defines a property as array of %String. Is it possible to index values of this property and use this property in SQL?

I have tried 'Index idx On prop(ELEMENTS)' and then a select from the generated collection table, but this is still orders of magnitude slower than queries to the containing class.

1 2
0 486
Question
· Mar 29, 2019
Ensemble as a Data lake

We have been storing raw messages in a MySQL database for DR and ad hoc purposes. We are thinking of using an Ensemble instance as our data lake instead. We could segregate the source data by namespace or by global. But either way we'll want a custom global to index the data for data retrieval performance purposes.

Anyone else taking this approach? Any feedback?

0 2
0 478