SQL - TOP/LIMIT/FETCH cause query to hang
Why do these clauses affect SQL performance?
select ID from some_table where row_status in ('I','U') order by ID limit 5 - makes the query infinite
select top 10 ID from some_table where row_status in ('I','U') order by ID - the same
select ID from some_table where row_status in ('I','U') order by ID - is fast
Actually there are no rows in the table having row_status 'I' or 'U'.
I asked Gemini and it recommended me rewrite the query as
select ID from table where row_status in ('I','U') order by ID || '' limit 5,
and this helped, but I have never encountered a similar problem with any other database server.
Comments
A couple questions:
1. Do you have an index on row_status? If not, then filtering on that with values that don't exist in the table likely results in a full table scan, though it is odd that the third query would be faster if this is the case.
2. Can you show us the query plans of the four queries? (see https://docs.intersystems.com/iris20253/csp/docbook/Doc.View.cls?KEY=GS… ) This might help us figure out what is going wrong.
Hi Nick,
- Yes I do (double checked)
- The plan is:
SELECT TOP ? ID FROM <schema> . <table> WHERE row_status IN ( ? , ? ) ORDER BY ID /*#OPTIONS {"DynamicSQLTypeList":"10,1,1"} */
Query Plan
Relative Cost = 105.04
• Read extent bitmap <schema>.<table>.%%DDLBEIndex, looping on ID.
• For each row:
- Read master map <schema>.<table>.IDKEY, using the given idkey value.
- Test the IN condition on %SQLUPPER(row_status) and the NOT NULL condition on %SQLUPPER(row_status).
- Output the row.
Thanks for checking that! Is this the query plan for the slow or fast version of the query, or are the query plans identical between them?
This plan is for the slow (to be more precise, infinite) one. The "fast" one (that uses the concatenation trick) looks like:
SELECT TOP ? ID FROM <schema>.<table> WHERE row_status IN ( ? , ? ) ORDER BY ID || '' /*#OPTIONS {"DynamicSQLTypeList":"10,1,1"} */ /*#OPTIONS {"rtpc-utility":1} */ /*#OPTIONS {"rtpc-log-sel":["1^.0390625"]} */
Warning
• Complex condition TOP on ID from table <schema>.<table> will not filter out any rows.
Information
• This query plan was selected based on the runtime parameter values that led to:
Using the outlier selectivity in an IN condition on row_status.
Query Plan
Relative Cost = 620185
• Call module B, which populates temp-file A.
• Read temp-file A, looping on the '_' expression and ID.
• For each row:
- Output the row.
Module: B
• Read index map <schema>.<table>.row_status_idx, looping on %SQLUPPER(row_status) (with a given set of values) and ID (with a TOP condition).
• For each row:
- Add a row to temp-file A, subscripted by the '_' expression and ID,
with no node data.
Comparing the two, it does look like the slower query is performing a full table scan for some reason, which is very odd. What type of index is row_status_idx? If it is not a bitmap index, could you try adding an additional bitmap index on the field (or changing the type of the index to bitmap)?
If it is important, the index was created automatically by the Data Migration Wizard. In the list of indexes, formerly it had type 'index'. I've recreated the index, this time explictly as 'bitmap':
create bitmap index <name> on <schema>.<table>(row_status)
Now I see that it has type 'bitmap'. Unfortunately, that did not work, and the plan hardly differs:
SELECT TOP ? ID FROM <table>.<schema> WHERE row_status IN ( ? , ? ) ORDER BY ID /*#OPTIONS {"DynamicSQLTypeList":"10,1,1"} */
Query Plan
Relative Cost = 105.04
• Read extent bitmap <table>.<schema>.%%DDLBEIndex, looping on ID.
• For each row:
- Read master map <table>.<schema>.IDKEY, using the given idkey value.
- Test the IN condition on %SQLUPPER(row_status) and the NOT NULL condition on %SQLUPPER(row_status).
- Output the row.
Hmm, that's not what I would have expected. I'm going to ask around and see if there's something I'm missing here, but in the meantime can you try running TUNE TABLE on this table and see if that changes the behavior/query plans at all? https://docs.intersystems.com/iris20253/csp/docbook/DocBook.UI.Page.cls…
Done. Neither `collect statistics` nor `tune table` help.
By the way, how normal is this? Index usage = 0 everywhere.png)
Did you also run a rebuild index ?
Just defining is not enough.
Two times :( First this index, then the whole table too. Then I recreated the index as bitmap, rebuilt all indexes again - doesn't work
three more notes (without using the concatenation like `order by ID || ''`):
- row_status in ('I', 'U') order by ID - hangs
- (row_status = 'I') or (row_status = 'U') order by ID - hangs
- (row_status = 'I') order by ID - fast
Since you used the migration wizard, I assume that the problem is due to the primary key ID and the indexes on it.
Can you use the wizard to create a test table with only two fields, ID and row_status, and post the source code of the generated class here? And more: how many records are there in the table?
Hi Vitaly and thanks for the idea. The table contains 3.5M rows
create table test as select row_status from <src>
create bitmap index test_row_status_idx on test(row_status)
select top 10 ID from test where row_status in ('U','I') order by ID
0.579s
So most probably you are right and the table indeed has some issues with indexes
I actually have similar issues since upgrading to 2025.3 in the quries executed by ProductionConfig.zen when you click a business component. I can see a full table scan on the Ens_Util.Log to read the 50 entries for the Logs tab. When there is enough matches, it finishes relatively fast, but for components wit only a few events, it can take up to a minute vefore the browser unfreezes. After removing statistics things seemed fine for a while, but the problem is back
Thanks Theo, I've also realized that IRIS is much more capricious than Postgres and that I will have to spend half my life tuning indexes in it.