Here at InterSystems, we often deal with massive datasets of structured data. It’s not uncommon to see customers with tables spanning >100 fields and >1 billion rows, each table totaling hundred of GB of data. Now imagine joining two or three of these tables together, with a schema that wasn’t optimized for this specific use case. Just for fun, let’s say you have 10 years worth of EMR data from 20 different hospitals across your state, and you’ve been tasked with finding….
every clinician within your network
who has administered a specific drug
between the years of 2017-2019
to patients who reside outside the state
and have one of the following conditions [diabetes, hypertension, asthma]
where the cost was covered by Medicaid
I’ve seen our technology handle these sort of cases just fine, but the query may still take a while to run. Can it be faster though? Let me walk you through a sample investigation.
///////////////////////////////////////////////////////////////////////////////////////////////
The Need:
Find all patients who have had an outpatient encounter at a facility located in one of these counties in the year 2022 or 2023
The Query:
SELECT DISTINCT enc.Patient->PatientNumber
FROM EMR.Encounter as enc
INNER JOIN State_Facility.Address as fa on enc.Facility = fa.FacilityCode
INNER JOIN State_Geography.Cities as city ON city.Zip = fa.ZipCode
WHERE enc.EncounterTime BETWEEN '2022-01-01' AND '2023-12-31'
AND enc.EncounterType IN ('OP','Outpatient','O')
AND city.County IN ('Los Angeles County', 'Orange County', 'Riverside County', 'San Bernardino County', 'Ventura County')
The Performance:
The query was taking >24 hours to complete
INVESTIGATION STEPS:
1) Review the tables that you’re querying. What relationships or foreign keys exist between them? What indices already exist? Is your SQL query making good use of the ones that already exist? Do the indices have Status = Selectable?
We checked each field that was part of a WHERE, AND, or INNER JOIN. Most of them did have indices, including some bitmap indices. [NOTE: Further to the right of the screenshot page, the Status column shows that EncounterTypeIndex is Selectable]
2) Review the Query Plan. Does it make sense? Does it make use of the indices and relationships you expected it would? If not, does it seem more or less efficient?
Yes, the Query Plan showed effective use of the indices on EncounterType and StartTime. [NOTE: This screenshot is for a simplified version of the query that does not consider the zip code of the encounter facility]
3) Ensure the table statistics up to date by running Tune Tables
4) Check whether the actual Query Plan at runtime matches the one you were shown. The "Show Plan" Query Plan does not utilize the Runtime Plan Choice (RTPC) optimization when it generates a Query Plan, but the RTPC is utilized when the query is actually run. That is why the Show Plan Query Plan and the runtime Query Plan can be different. The RTPC algorithm usually finds an optimal choice, but it can sometimes make a poor choice. If we find that the RTPC algorithm is making the wrong choice, it is possible to suppress the RTPC at runtime by using the %NORUNTIME keyword.
Once the query was running, we looked at the Processes page and found the process that was running the query. We found the cached query that it was running (the Routine). We went to that cached query and looked at its Query Plan. We found that it was using a Query Plan that was very different from the one we’d seem before, and it looked much less efficient.
RECOMMENDATIONS:
We recommended that the customer take the following actions:
1) Use the %NORUNTIME keyword when executing the query, forcing it to use the more efficient Query Plan
2) Build a new bitmap index called EncounterDate based on the EncounterTime field. Date-based indices can be faster than DateTime-based indices, and bitmap indices are often significantly faster than normal indices
Once they implemented these two recommendations, their query was now completing in ~6 hours, a 75% improvement.
FURTHER READING:
Check out @Ben Spead's excellent collection of resources, which includes links to online documentation, InterSystems online learning courses, presentation slideshows, and Developer Community articles.
https://community.intersystems.com/post/sql-performance-resources
Nice article @Ben Schlanger !
I like how you're laying out the investigative process, though I think it's worth noting that every case is different and therefore recommendations also can differ. Especially the
%NORUNTIME
hint should be used with caution as it may deprive you of better plans in most scenarios. In fact, we like to say that any time you have to revert to that hint, it's worth opening a case with the WRC as it should be an improvement opportunity for our engine to make that better choice automatically (available statistics permitting) :-)Also, I'd like to advertise a few improvements we've made since the IRIS version shown here:
enc.EncounterTime BETWEEN '2022-01-01' AND '2023-12-31'
"EXPLAIN
and SMP utilities to no longer show the generic plan, after substituting out all parameter values, but the actual plan you'll get at runtime with the literal values you put in the query text. This addresses step #4 in the investigation described above.The above features are all specifically introduced to help diagnose long-running queries more quickly and identify how things can be sped up, but of course these versions also include general performance enhancements and refinements to the RTPC infrastructure too, so it'll be exciting to see how fast this customer's query runs on the latest and greatest IRIS release!