Long Running SQL Queries: a sample exploration

Article

Ben Schlanger · May 7 4m read

#Performance #Relational Tables #SQL #Tips & Tricks #InterSystems IRIS

Here at InterSystems, we often deal with massive datasets of structured data. It’s not uncommon to see customers with tables spanning >100 fields and >1 billion rows, each table totaling hundred of GB of data. Now imagine joining two or three of these tables together, with a schema that wasn’t optimized for this specific use case. Just for fun, let’s say you have 10 years worth of EMR data from 20 different hospitals across your state, and you’ve been tasked with finding….
every clinician within your network
      who has administered a specific drug
         between the years of 2017-2019
            to patients who reside outside the state
                and have one of the following conditions [diabetes, hypertension, asthma]
                    where the cost was covered by Medicaid

I’ve seen our technology handle these sort of cases just fine, but the query may still take a while to run. Can it be faster though? Let me walk you through a sample investigation.

///////////////////////////////////////////////////////////////////////////////////////////////

The Need:
Find all patients who have had an outpatient encounter at a facility located in one of these counties in the year 2022 or 2023

The Query:
SELECT DISTINCT enc.Patient->PatientNumber
FROM EMR.Encounter as enc
INNER JOIN State_Facility.Address as fa on enc.Facility = fa.FacilityCode
INNER JOIN State_Geography.Cities as city ON city.Zip = fa.ZipCode
WHERE enc.EncounterTime BETWEEN '2022-01-01' AND '2023-12-31'
AND enc.EncounterType IN ('OP','Outpatient','O')
AND city.County IN ('Los Angeles County', 'Orange County', 'Riverside County', 'San Bernardino County', 'Ventura County')

The Performance:
The query was taking >24 hours to complete

INVESTIGATION STEPS:

1) Review the tables that you’re querying. What relationships or foreign keys exist between them? What indices already exist? Is your SQL query making good use of the ones that already exist? Do the indices have Status = Selectable?

We checked each field that was part of a WHERE, AND, or INNER JOIN. Most of them did have indices, including some bitmap indices. [NOTE: Further to the right of the screenshot page, the Status column shows that EncounterTypeIndex is Selectable]

2) Review the Query Plan. Does it make sense? Does it make use of the indices and relationships you expected it would? If not, does it seem more or less efficient?

Yes, the Query Plan showed effective use of the indices on EncounterType and StartTime. [NOTE: This screenshot is for a simplified version of the query that does not consider the zip code of the encounter facility]

3) Ensure the table statistics up to date by running Tune Tables

4) Check whether the actual Query Plan at runtime matches the one you were shown. The "Show Plan" Query Plan does not utilize the Runtime Plan Choice (RTPC) optimization when it generates a Query Plan, but the RTPC is utilized when the query is actually run. That is why the Show Plan Query Plan and the runtime Query Plan can be different. The RTPC algorithm usually finds an optimal choice, but it can sometimes make a poor choice. If we find that the RTPC algorithm is making the wrong choice, it is possible to suppress the RTPC at runtime by using the %NORUNTIME keyword.

Once the query was running, we looked at the Processes page and found the process that was running the query. We found the cached query that it was running (the Routine). We went to that cached query and looked at its Query Plan. We found that it was using a Query Plan that was very different from the one we’d seem before, and it looked much less efficient.

RECOMMENDATIONS:

We recommended that the customer take the following actions:
1) Use the %NORUNTIME keyword when executing the query, forcing it to use the more efficient Query Plan
2) Build a new bitmap index called EncounterDate based on the EncounterTime field. Date-based indices can be faster than DateTime-based indices, and bitmap indices are often significantly faster than normal indices

Once they implemented these two recommendations, their query was now completing in ~6 hours, a 75% improvement.

FURTHER READING:

Check out @Benjamin.Spead's excellent collection of resources, which includes links to online documentation, InterSystems online learning courses, presentation slideshows, and Developer Community articles.
https://community.intersystems.com/post/sql-performance-resources

Benjamin De Boe · Jan 10

Nice article @Ben Schlanger !

I like how you're laying out the investigative process, though I think it's worth noting that every case is different and therefore recommendations also can differ. Especially the %NORUNTIME hint should be used with caution as it may deprive you of better plans in most scenarios. In fact, we like to say that any time you have to revert to that hint, it's worth opening a case with the WRC as it should be an improvement opportunity for our engine to make that better choice automatically (available statistics permitting) :-)

Also, I'd like to advertise a few improvements we've made since the IRIS version shown here:

Improved feedback in the query plan: we've been displaying a note in the query plan for a while now if there's a chance that different runtime parameter values may lead to a different plan, and as of IRIS 2023.3 are even calling out the specific predicates that drove the RTPC decision. For example, your plan may say "This query plan was selected based on the runtime parameter values that led to improved selectivity estimation of the range condition enc.EncounterTime BETWEEN '2022-01-01' AND '2023-12-31'"
Showing the actual runtime plan: Starting with IRIS 2023.3, we've enhanced the EXPLAIN and SMP utilities to no longer show the generic plan, after substituting out all parameter values, but the actual plan you'll get at runtime with the literal values you put in the query text. This addresses step #4 in the investigation described above.
SQL Process View: As of IRIS 2022.2, the Operations menu in the System Management Portal includes a "SQL Activity" link that leads you to a page listing all currently-running SQL statements, and allows you to drill through to the statement details and query plan. This also helps with step #4, and to identify any long-running queries in the first place. An aggregated form of this data is also available through the /api/metrics endpoint for consumption through a monitoring tool.
Query and schema recommendations: In IRIS 2024.3, released last month, we've further expanded the information contained in the query plan from the RTPC related notes described above, to also include warnings on indices marked as non-selectable (cf investigation step #1), indices that are being ignored because they have non-matching collation, whether the plan is frozen, and similar additional information that may help you improve the statement text, schema, or overall system settings

The above features are all specifically introduced to help diagnose long-running queries more quickly and identify how things can be sped up, but of course these versions also include general performance enhancements and refinements to the RTPC infrastructure too, so it'll be exciting to see how fast this customer's query runs on the latest and greatest IRIS release!

1 0