Article
· Jul 21 3m read

Speed up your full-text searches with %iFind indexes

Greetings dear community members!

Many of you will remember the NLP capabilities available in IRIS under the name iKnow, which were deprecated not long ago. But...is everything deprecated? NO! A small village resists deprecation: iFind indexes!

And you might be wondering, what are these magnificent indexes for? Well, it's very simple: they index the text in String and Stream columns and dramatically speed up querying.

What are %iFind indices?

These are a type of bitmap-based indexes that map each unique value in a column to a string of bits, indicating for each row whether a given value is available. You can find more details about these types of indexes in the official documentation .

We have two types of indexes in %iFind, the minimum and the basic, which extends the former.

%iFind.Index.Minimal

Supports SQL searches for words and phrases using wildcards, fuzzy searches, and regular expressions. It does not support co-occurrence or positional phrase searches, nor does it support highlighting results.

%iFind.Index.Basic

It supports all the features of the minimal index (SQL wildcard word and phrase searching) and adds support for co-occurrence, positional phrase searching, and result highlighting. This index's features are sufficient for most common full-text searches.

How to define a %iFind index?

It couldn't be simpler. I'll show you with a small example of a development I've implemented to read public tender information in Spain:

Class Inquisidor.Object.Licitacion Extends (%Persistent, %XML.Adaptor) [ DdlAllowed ]
{

Property IdLicitacion As %String(MAXLEN = 200);
Property Titulo As %String(MAXLEN = 2000);
...

Index IndexIdLicitation On IdLicitacion [ PrimaryKey ];
Index IndexTitulo On (Titulo) As %iFind.Index.Basic(INDEXOPTION = 0, LANGUAGE = "es");

As you can see, we have defined the Title property on which we want to perform searches that allow me to find certain tenders. An example of the titles we will have will be the following:

Maintenance and support service for the CIVITAS software application for managing individual health cards for the Regional Health Department of Castile and León.

To define the %iFind index we'll use the %iFind.Index.Basic(INDEXOPTION = 0, LANGUAGE = "es") configuration. In this case, it's a basic index. As you can see, we have several properties that allow us to define how we want our index to work. Let's take a look at the available properties:

  • IGNOREPUNCTUATION: Supports 2 values, 0 and 1. By default it will have the value 1 and will ignore punctuation marks in the text.
  • INDEXOPTION: Also available in 0 and 1. Allows you to specify whether the index will allow lemmatization or decomposition of texts. Due to its large size, it should only be enabled when necessary (value 1).
  • LANGUAGE: to define the dictionary to be used in searches, in our example, Spanish.
  • LOWER: with values 0 or 1. It will allow you to indicate whether the index will be case sensitive, by default it will have the value 0, ignoring it.
  • USERDICTIONARY: to use a user dictionary prior to indexing.

How to use the %iFind index in a query¿Cómo usar el índice %iFind en una consulta?

To use this type of index we must use the following notation:

SELECT * FROM Inquisidor_Object.Licitacion WHERE %ID %FIND search_index(IndexTitulo, ?)

Let's see the index in action. In my example, I have a table with 800,000 public tender records. Let's look at the plan with the traditional LIKE on our table:

Now let's look at the plan using the index:

As you can see, the relative cost between the normal search and the search with the %iFind index is astronomical: 1239110280 for the query without an index versus 8323422 for the indexed query, making it 150 times faster.

If you'd like to see more details about the types of searches that %iFind indexes allow, here 's the associated documentation.

I hope it is useful to you!

Discussion (2)2
Log in or sign up to continue