Article
· 7 hr ago 4m read

Response times when using Dynamic and Embedded SQL

When working with InterSystems IRIS, database developers and architects often face a critical decision: whether to use Dynamic SQL or Embedded SQL for querying and updating data. Both methods have their unique strengths and use cases, but understanding their performance implications is essential to making the right choice. Response time, a key metric in evaluating application performance, can vary significantly depending on the SQL approach used. Dynamic SQL offers flexibility, as queries can be constructed and executed at runtime, making it ideal for scenarios with unpredictable or highly variable query needs. Conversely, Embedded SQL emphasizes stability and efficiency by integrating SQL code directly into application logic, offering optimized response times for predefined query patterns.

In this article, I will explore the response times when using these two types of SQL and how they depend on different class structures and usage of parameters. So to do this, I'm going to use the following classes from the diagram:

To test the times, I created the following numbers of objects for each class:

  • Patient - 50M
  • Visit - 150M
  • Doctor - 500K
  • Address - 50M

This way, I expected to see some reasonable time for running queries and to see the differences between executing Embedded SQL and Dynamic SQL. The only index that I added was the one-many automatic index for the Doctor-Visit relationship.

So let's look at the queries that I'm going to run, and then at the duration of execution:

  1. select count(*) from Hospital.Address
  2. select count(*) from Hospital.Address where State = :param
  3. select count(*) from Hospital.Patient left join Hospital.Address on p.address = a.id
  4. select count(*) from Hospital.Patient left join Hospital.Address on p.address = a.id where a.State = :param
  5. select count(a.Address->State) from Hospital.Patient a
  6. select count(*) from Hospital.Patient where p.Address->State = :param
  7. select count(p.Visit->VisitDate) from Hospital.Patient p
  8. select count(*) from Hospital.Patient where p.Visit->VisitDate > :param
  9. select count(v.Patient->Name) from Hospital.Visit v
  10. select count(*) from Hospital.Visit where v.Patient->Name %startswith :param
  11. select count(v.Patient->Address->State) from Hospital.Visit v
  12. select count(*) from Hospital.Visit where v.Patient->Address->State = :param
  13. select count(v.Doctor->Name) from Hospital.Visit v
  14. select count(*) from Hospital.Visit where v.Doctor->Name %startswith :param
  15. select count(*) into :p from Hospital.Visit where v.Doctor->Name %startswith :param and v.Patient->Name %startswith :param
  16. select count(*) into :p from Hospital.Visit where v.Doctor->Name %startswith :param and v.Patient->Name %startswith :param and v.Patient->Address->State = :param1

Obviously, the above is the syntax for Embedded SQL (because there are named parameters). For Dynamic SQL, the queries are almost the same, but instead of named parameters, I have unnamed ones 😉 For example, for the last one, I have the following query:

select count(*) from Hospital.Visit v where v.Doctor->Name %startswith ? and v.Patient->Name %startswith ? and v.Patient->Address->State = ?

Now let's look at the results:

No of query Embedded SQL (sec) Dynamic SQL (sec)
1 49 12
2 3 3
3 32 26
4 47 46
5 48 46
6 47 46
7 1767 1806
8 1768 1841
9 31 26
10 83 81
11 41 45
12 73 71
13 23 26
14 1 1
15 2 2
16 3 3

We can see one colossal outlier, which is the first query. The Embedded SQL took a lot more time to execute than Dynamic SQL. Running the same queries several times gave me more or less the same result. So it is what it is. 

In general, we can see that parent-children relationship is a lot slower from the side of the children property, even though all the data of Patient and Visit is stored in the Patient global. The index saved the day for the one-many relationship, and the execution time was considerably shorter. All in all, the response time is mostly similar and differs by less than 10%; sometimes, it is the same. Of course, I used simple queries that didn't take too long to prepare, so this stage could almost be ignored.

Discussion (3)5
Log in or sign up to continue

Dynamic SQL offers flexibility, as queries can be constructed and executed at runtime, making it ideal for scenarios with unpredictable or highly variable query needs. Conversely, Embedded SQL emphasizes stability and efficiency by integrating SQL code directly into application logic, offering optimized response times for predefined query patterns.

Just a quick note about Embedded vs Dynamic: in recent versions of IRIS, both types are compiled at runtime and both make use of Universal Query Cache, the differences in efficiency and performance should be minor, if any.

A couple of thoughts to this

1. Clearly all of the metrics depend on the presence of any indices.  For Query #1 it may have done a table scan vs a (bitmap) index scan.  However, you are comparing the same query so it should be an apples to apples comparison.

2  Did you perform the test and ensure the global cache was flushed.  For Query #1 if you ran the embedded query firstly, the side effect of that is much of the data was in the global cache which mean the Dynamic SQL query was able to do logical reads vs physical reads.  This could explain why it is faster.  I would expect that both techniques used the same exact Query plan and likely used the same exact generated query routine (%sqlcq*).  For that reason I wouldnt expect the times to be so very different.  BTW I dont know of a way to flush the global cache other than to restart IRIS and perform the statement again.  Another way of lookng at this is if you run a sql statement(whether Embedded or Dynamic) twice, the second time will always be faster as the data is now in the logical global cache.

3. I dont agree with "In general, we can see that parent-children relationship is a lot slower from the side of the children property, even though all the data of Patient and Visit is stored in the Patient global.".  The devil is in the details and the indices, where clause. 

For any of these queries without seeing the query plan we dont have the full story.  We also dont know

1. the return for the count(*)

2. were the queries purley index satisfiable or did they have to go back to the table(mastermap).  If they had to go to the table how "fat"(how many total columns and the overall lentgh of a row) was the table and how tall(how many total rows) was the table.    

Since I work a lot on performance issues, I tend to agree with your comments @Stephen Canzano .  

For me, the truth is the query plan and the number of GloRef \commands executed at runtime.

@Irene Mykhailova I wonder how the result would with a clear cache between each query.

We can clear the cache with : 

d ClearBuffers^|"%SYS"|GLOBUFF()

don't do this in production ;-)

In my own experience, the efficiency of the IRIS cache can hide problems or be misleading.

In recent versions, I think the difference could be marked for embedded SQL on the first run only if “sqlcompile mode=deferred” is not used.  deferred mode is the default mode for years and can be overridden with a compile qualifier (and maybe a system-wide parameter).