Question
· Oct 30, 2019

SQL query index using

Hi guys,
I ran into a strange (for me) situation, when I run same query but change the WHERE clause the plan is different and is not connected to the additional condition.
Query that doesn't use the necessary index:

SELECT * FROM Portal.ProductStats ps  left JOIN  Portal.ProductCacheUpdates pcu ON (pcu.Item=ps.Item) WHERE ps.Item=?
Takes 0.4 sec,  doesn't use index on item from Portal.ProductCacheUpdates table

SELECT * FROM Portal.ProductStats ps  left JOIN  Portal.ProductCacheUpdates pcu ON (pcu.Item=ps.Item) WHERE ps.Item=? AND  ps.dateImported<pcu.dateEdited
Takes 0.003 sec, uses index on item from Portal.ProductCacheUpdates table

Can somebody explain? What i'm doing wrong? How should I know if Cache will use the index or not? Check every query plan?

Discussion (9)0
Log in or sign up to continue

Class Portal.ProductStats Extends %Persistent
{

Property Item As %String [ Required ];

Property dateImported As %String [ Required ];

Index pcacheUpds On Item;
}
Class Portal.ProductCacheUpdates Extends %Persistent
{

Property Product As %String [ Required ];

Property Item As %String [ Calculated, SqlComputeCode = { s {*}=$e({Product},1,5)}, SqlComputed ];

Property dateEdited As %String [ Required ];

Index pcacheUpdsUniq On (Product, Item) [ Unique ];

Index pcacheUpdsProd On Item;

}

Cache for Windows (x86-64) 2017.2 (Build 744U) Fri Sep 29 2017 10:58:27 EDT

  1. Instead of
    Property Product As %String Required ];
    Property Item As %String CalculatedSqlComputeCode = { {*}=$e({Product},1,5)}, SqlComputed ];
    

    it is better to use

    Property Product As %String Required ];
    Property Item As %String(MAXLEN 5) [ RequiredSqlComputeCode = { {*}=$e({Product},1,5)}, SqlComputedSqlComputeOnChange = Product ];
    
  2. Check on the version 2017.2 can not, but checked on 2018.1
    SELECT * FROM Portal.ProductStats ps left JOIN  Portal.ProductCacheUpdates pcu ON (pcu.Item=ps.ItemWHERE ps.Item=?
    • ExtentSize=1 (Portal.ProductStats)
      ExtentSize=1000 (Portal.ProductCacheUpdates)
      
      Relative cost  = 1338
      ◾Read master map Portal.ProductStats.IDKEY, looping on ID.
      
      ◾For each row:
      
       Read index map Portal.ProductCacheUpdates.pcacheUpdsProd, using the given %SQLUPPER(Item), and looping on ID.
       For each row:
       Read master map Portal.ProductCacheUpdates.IDKEY, using the given idkey value.
       Generate a row padded with NULL for table Portal.ProductCacheUpdates if no row qualified.
       Output the row.
    • ExtentSize=1000 (Portal.ProductStats)
      ExtentSize=1 (Portal.ProductCacheUpdates)
      Relative cost  = 1219.2
      ◾Read index map Portal.ProductStats.pcacheUpds, using the given %SQLUPPER(Item), and looping on ID.
      
      ◾For each row:
      
       Read master map Portal.ProductStats.IDKEY, using the given idkey value.
       Read index map Portal.ProductCacheUpdates.pcacheUpdsProd, using the given %SQLUPPER(Item), and looping on ID.
       For each row:
       Read master map Portal.ProductCacheUpdates.IDKEY, using the given idkey value.
       Generate a row padded with NULL for table Portal.ProductCacheUpdates if no row qualified.
       Output the row.

      As you can see in both cases the index pcacheUpdsProd is used.

Have you really set up the tables and cleared the cached queries so that the optimizer can start using the new statistics?

Try to do it manually in the terminal:

blablabla>d $system.SQL.TuneSchema("Portal",1), $SYSTEM.SQL.Purge(), $system.Status.DisplayError($system.OBJ.CompilePackage("Portal","cu-d"))

Sergey

The biggest difference between the 2 queries is the second one is really a JOIN, not a Left Outer Join.  By adding the second WHERE condition on a property in pcu you are making this a JOIN and allowing the Optimizer to start in either table.  I created the tables on my system and here is what I am seeing

The first query starts in ps using the item index and then joins to the pcu table using a temp file built from the master map (same as the plan you show)

The second query is starting with the pcu table and the doing a join to the ps table using a temp file build from the Item index and the master map.

Why don't we use the Item index in pcu?  I can't tell you exactly why, but I know the 2 biggest factors we are looking at are the size of the maps on disk and the Selectivity of the properties involved.  In both cases, we are going to need to read the master map so the question is this:  Will reading in the Index map reduce the number of blocks we need to read of the master map to make it worth the effort?  Without proper values generated by TuneTable, or supplied by you, the optimizer is left to guess at what these values are.  In the first case, it has decided that reading the index will not reduce the number of block reads of the master map so it is not using it.  This generally happens when we think the Master Map is very small.

To take this any further I would need you to run TuneTable on both of these tables and see if the poor plan is still used.  If it is still used please provide the storage info for both classes so I can investigate this.