Index Recommended Approaches

Hi,

I have a class with around 400k lines and 60 columns. Class storage is Cache SQL storage (Mapped from a global).

 I want to create multiple indices on certain fields.

I am familiar with two approaches:

1. Create a new map (Index type) on a pointer global.

2. Create a bitmap index

Which approach is more recommended to be used in the case I described? If there are any other approaches, I will be happy to hear.

Thanks :)

 

  • 0
  • 0
  • 414
  • 8
  • 3

Answers

Documentation says following:

For a table containing more than 1 million records, a bitmap index is less efficient than a standard index when the number of unique values exceeds 10,000. Therefore, for a large table it is recommended that you avoid using a bitmap index for any field that is contains (or is likely to contain) more than 10,000 unique values; for a table of any size, avoid using a bitmap index for any field that is likely to contain more than 20,000 unique values. These are general approximations, not exact numbers.

I created a bitmap index for a field that has around 50 unique values, but when I run SQL query on that field it is failing to perform due to timeout. What can be the reason for it?

Many thanks!

I would suggest you to run the query in terminal in $system.SQL.Shell() or using external tools via JDBC (DBVisualizer, Squirrel). By default Management Portal timeouts when query takes longer than 60 seconds.

Also, see Caché SQL Optimization Guide, especially section "Interpreting an SQL Query Plan". Maybe the chosen plan is not optimal.

Did you look at the Show Plan to see if we are using the Bitmap index you defined?  I am going to guess the answer is no.

 

Can you show us how you defined the bitmap index?  If the property is defined as a string you should change the collation to Exact, or define the collation in the index.

If you have a look at The Art of Mapping Globals to Classes (5 of 3) there should be an example of defining a bitmap index in Cache SQL Storage.

 

 

 

Hi Brendan,

I did check the query plan now, and bitmap index does not seem to work.

Based on your post, I made the following changes:

  • Changed the type of IDKey property to be integer (It indeed contains numbers only),
  • Changed the Index property collation to Exact
    • Indexed property is a status code that contain around 10 unique values

but query plan remains as is.

What else can I be missing here? 

*I don't seem to be able to upload the class here, but I sent it to your email.

Changing IDKey to %Integer is not enough for Bitmap Indexing

You also have to set MINVAL =1  to prevent 0 or negative integers !

I used this rule of thumb:

selectivity > 8 %  candidate for bitmap index

selectivity < 2 % normal index

in between it's a mater of investigation and other side conditions beyond selectivity

By 'Normal index',  you mean the second approach I mentioned (Index data map on a pointer)?

Thanks for sending the class to me Yaniv.  

 

Life is always easier when you don't have to guess what is happening. 

 

The 2 options list above are not really options, but rather the steps you should take when defining any type of index for Cache SQL Storage.  Defining the Index def helps us correctly report info to external databases, but it is the map in the storage that the query optimizer is looking for, so #1 is required for all types of indices.  

Here is the property and index

 

Property StatusCode As %String(COLLATION "EXACT");

Index StatusCodeIdx On StatusCode [ Type = bitmap ];

 

When you want to define a bitmap you do not add the IDKey as a subscript, it will be generated in the data.  For Yaniv's class this is what the bitmap storage def will look like:

 

<SQLMap name="StatusCodeMap">
<Global>^SPMORDP</Global>
<Subscript name="1">
<Expression>2</Expression>
</Subscript>
<Subscript name="2">
<Expression>{StatusCode}</Expression>
</Subscript>
<Type>bitmap</Type>
</SQLMap>

 

If you do not want to use EXACT collation then the Collation that is defined in the Property needs to match the collation defined in the map.  If no collation is listed for a property the default is SQLUPPER, so when the Property looks like this:

Property StatusCode As %String;

 

the Map needs to have a Subscript that looks like this:

<Subscript name="2">
<Expression>$$SQLUPPER({StatusCode})</Expression>
</Subscript>

 

Also when ever you define a bitmap in a class you should also define a Bitmap Extent.  The class compiler does this automatically for Default Storage, but for Cache SQL Storage you need to define the extent map.  The <Type> is bitmapextent and no fields are listed as subscripts:

 

<SQLMap name="BitmapExtent">
<Global>^SPMORDP</Global>
<Subscript name="1">
<Expression>3</Expression>
</Subscript>
<Type>bitmapextent</Type>
</SQLMap>

 

 

If you are using Objects or SQL to modify the data then these indices will be maintained by the generated class code.  If you application is still doing Global sets and kills then you will need to write code to maintain the indices.  You can look at the class Mapping.BitMapExample that is in The Art of Mapping Globals to Classes (5 of 3) to see what that code would look like.

 

If you have any questions please let me know.

 

Here is my version of Yaniv's class

 

Brendan