Question
· Jun 6

Shared code execution speed

Let's suppose two different routines use one and the same chunk of code. From the object-oriented POV, a good decision is to have this chunk of code in a separate class and have both routines call it. However, whenever you call code outside of the routine as opposed to calling code in the same routine, some execution speed is lost. For reports churning through millions of transactions this lost speed might be noticeable. Any advice how to optimize specifically speed?
P.S. Whenever someone is talking about the best choice for whatever, I am always tempted to ask: "What are we optimizing?". Optimizing speed here.
P.P.S. I did notice that the speed of some classes is very different from that of old-style utilities that do largely the same thing, such as exporting.

Product version: Caché 2017.1

Loading compiled object code from cache into the partition should not have any noticeable impact.
But you are right in principle: it is some kind of overhead and it is not free.

If you place the affected code in an .INC routine, you can share that piece
rather easily across multiple classes and routines.
Although includes are mostly not used that way, any include may also contain executable code.
For a .MAC routine this is nothing special.
For class code it is a bit tricky, but it works as well.

example ANNA.INC

anna(name) ;
 write !,"Hello ",name,!
 quit ">>>"_name_"<<<"

example Anna.CLS
 

Include ANNA
/// demo for Anna
Class A.Anna {
ClassMethod demo(name As %String) As %String
{
	quit $$anna(name)
}
}

It works:

SAMPLES>write "===",##class(A.Anna).demo("robert")
===
Hello robert
>>>robert<<<
SAMPLES>

So multiple loading is reduced.

Of course, you also have the option of writing a custom command in %ZLANG***.MAC.
I just have no experience of how this affects partition loading.
 

...whenever you call code outside of the routine as opposed to calling code in the same routine, some execution speed is lost. For reports churning through millions of transactions this lost speed might be noticeable.

Can you reproduce this with a small, self-contained example that compares code in the same class/routine with the same code calling a different class/routine, and where the difference is noticeable?

To verify my approach, I looped over a simulated table of 100 million rows:

  • SQL procedure TEST1 uses an external Class Method based on anna.INC
  • SQL procedure TEST2 uses an internal Class Method based on anna.INC

The difference is evident:

[SQL]SAMPLES>>select A.HUGE_fill(100000000)
18.     select A.HUGE_fill(100000000)
 
| Expression_1 |
| -- |
| ^A.HUGED=100000000 |
 
1 Rows(s) Affected
statement prepare time(s)/globals/cmds/disk: 0.0008s/5/828/0ms
          execute time(s)/globals/cmds/disk: 18.8332s/100,000,002/200,000,445/0ms
                                query class: %sqlcq.SAMPLES.cls3
---------------------------------------------------------------------------
[SQL]SAMPLES>>select list(A.HUGE_TEST1(ID)) from A.HUGE
19.     select list(A.HUGE_TEST1(ID)) from A.HUGE
 
| Aggregate_1 |
| -- |
|  |
 
1 Rows(s) Affected
statement prepare time(s)/globals/cmds/disk: 0.0005s/4/141/0ms
          execute time(s)/globals/cmds/disk: 101.5573s/100,000,001/700,000,424/0ms
                                query class: %sqlcq.SAMPLES.cls2
---------------------------------------------------------------------------
[SQL]SAMPLES>>select list(A.HUGE_TEST2(ID)) from A.HUGE
20.     select list(A.HUGE_TEST2(ID)) from A.HUGE
 
| Aggregate_1 |
| -- |
|  |
 
1 Rows(s) Affected
statement prepare time(s)/globals/cmds/disk: 0.0005s/4/141/0ms
          execute time(s)/globals/cmds/disk: 72.1640s/100,000,001/700,000,424/0ms
                                query class: %sqlcq.SAMPLES.cls1
---------------------------------------------------------------------------
[SQL]SAMPLES>>

Rough calculation: including the code in the class saves ~30% of the execution time (72.2 s versus 101.6 s).

My class code:

Include anna
Class A.HUGE Extends (%Persistent, %Populate)
{
Property calc As %Integer [ Calculated, SqlComputeCode = { set {*}={%%ID}}, SqlComputed ];
ClassMethod fill(size) As %String [ SqlProc ]
{
	for i=1:1:size set ^A.HUGED(i)=""
	set ^A.HUGED=i
	quit $ZR_"="_@$ZR
}
ClassMethod test1(val) As %String [ SqlProc ]
{
	quit ##class(A.PERSON).Anna(val)
}
ClassMethod test2(val) As %String [ SqlProc ]
{
	quit $$anna(val)
}
}

The simplified anna.INC just returns an empty string, to concentrate on the code switching:

anna(name) 
 quit ""

OK - in UDL

IRIS for Windows (x86-64) 2024.3 (Build 217U) Thu Nov 14 2024 17:59:58 EST

ROUTINE anna [Type=INC] 
anna(name) 
 quit ""

A.HUGE.cls

Include anna 
Class A.HUGE Extends (%Persistent, %Populate)
{ 
 Property calc As %Integer [ Calculated, SqlComputeCode = { set {*}={%%ID}},    SqlComputed ]; 

ClassMethod fill(size) As %String [ SqlProc ]
{
 for i=1:1:size set ^A.HUGED(i)=""
 set ^A.HUGED=i
 quit $ZR_"="_@$ZR
} 

ClassMethod test1(val) As %String [ SqlProc ]
{
 quit ##class(A.PERSON).Anna(val)
} 
ClassMethod test2(val) As %String [ SqlProc ]
{
 quit $$anna(val)
} 
Storage Default
{
 <Data name="HUGEDefaultData">
 <Value name="1">
 <Value>%%CLASSNAME</Value>
 </Value>
 </Data>
 <DataLocation>^A.HUGED</DataLocation>
 <DefaultData>HUGEDefaultData</DefaultData>
 <IdLocation>^A.HUGED</IdLocation>
 <IndexLocation>^A.HUGEI</IndexLocation>
 <StreamLocation>^A.HUGES</StreamLocation>
 <Type>%Storage.Persistent</Type>
} 
}

A.PERSON.cls

Include anna 
Class A.PERSON Extends %Persistent
{ 
 Property calc As %Integer [ Calculated, SqlComputeCode = { set {*}={%%ID}},  SqlComputed ];

ClassMethod fill(size) As %String [ SqlProc ]
{
 for i=1:1:size set ^A.PERSOND(i)=""
 set ^A.PERSOND=i
 quit $ZR_"="_@$ZR
} 

ClassMethod test1(val) As %String [ SqlProc ]
{
 quit ##class(A.PERSON).Anna(val)
} 

ClassMethod Anna(name As %String) As %String
{
 quit $$anna(name)
} 

Storage Default
{
 <Data name="PERSONDefaultData">
 <Value name="1">
 <Value>%%CLASSNAME</Value>
 </Value>
 </Data>
 <DataLocation>^A.PERSOND</DataLocation>
 <DefaultData>PERSONDefaultData</DefaultData>
 <IdLocation>^A.PERSOND</IdLocation>
 <IndexLocation>^A.PERSONI</IndexLocation>
 <StreamLocation>^A.PERSONS</StreamLocation>
 <Type>%Storage.Persistent</Type>
}
}

Ciao Robert,

I'm not sure your test addresses the original question, that is:

"whenever you call code outside of the routine as opposed to calling code in the same routine, some execution speed is lost"

In your test1 you do:

[call class method] -> [call class method in other class] -> [call a function in same class]

In your test2 you do:

[call class method] -> [call a function in same class]

This way you are not comparing "call code outside of the routine as opposed to calling code in the same routine"; you are ADDING an extra call level.

In addition, when measuring the time penalty of calling inside versus outside a "routine" (or class), adding 100M global accesses does not help to get a reliable measurement, because too many factors can change the measured time between runs.
Finally, a doubt: are we sure that comparing a function call with a class method call is a fair comparison? Maybe, maybe not; I don't know.
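
One way to reduce run-to-run noise is to repeat the same measurement several times and keep the fastest run. The following is only a sketch under that assumption; the class Demo.BestOf and its methods are hypothetical names and are not part of Robert's test.

```objectscript
Class Demo.BestOf
{

/// Hypothetical helper: time the same loop several times and
/// return the fastest run in seconds
ClassMethod Run(runs As %Integer = 5, calls As %Integer = 1000000) As %Numeric
{
	set best=""
	for r=1:1:runs {
		set t0=$zhorolog
		for i=1:1:calls { set x=..Task(i) }
		set elapsed=$zhorolog-t0
		// keep the smallest elapsed time seen so far
		if (best="")||(elapsed<best) { set best=elapsed }
	}
	quit best
}

/// Trivial method under test: no global access, so mostly the call itself is timed
ClassMethod Task(val)
{
	quit val
}

}
```

It could be called as write ##class(Demo.BestOf).Run(5,1000000). Taking the minimum rather than the average is a design choice: interference from other processes can only make a run slower, never faster.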

My approach was rather simple.

  • at runtime, every class has its .INT, which has its .OBJ
  • the .OBJ is in the partition.
    • if I stay inside the .OBJ, it's fine
    • if I have to load another .OBJ and then reload the original .OBJ, it consumes processor cycles
    • both .OBJ can be assumed to be cached, so it's a pure memory exercise
  • the difference between the two variants is sub-microscopic
    • so looping 100 million times is a kind of zoom-in to get something visible
    • The 100 M are common to both scenarios and
    • the Global has only (800 Mb) >>> 8 bytes / record counted by 
  • I decided for SQL Shell for its nice runtime display.

SUMMARY: There is a difference.
But I wouldn't lift a finger to attack it (not even on a PDP-11).
This is not where performance comes from.

It is much simpler with two identical .INT routines, a1 and a2:

ROUTINE a1 [Type=INT]
load ;
  read !,"loops=",loop,! 
  do t1 hang 0.5 do t2 quit
next 
  set t1=$zh quit 
t1 
  set t0=$zh
  for i=1:1:loop do next
  write t1-t0,!
  quit 
t2 
  set t0=$zh
  for i=1:1:loop do next^a2
  write t1-t0,!
  quit

SAMPLES>d ^a1
loops=1000000
.081626
.136785
SAMPLES>

I just mean you can't get much simpler than this,
and here the difference is even larger: about 40.3% (1 - .081626/.136785).

To measure the difference, I took a different approach from Robert's and concentrated on same-class versus different-class calls, without any additional overhead.
Please note that this is by no means a demonstration of the best or worst approach. IMHO there is a better solution; see my next post.

I created two classes and simulated a large number of calls to a class method within the same class, and the same code calling a class method in a different class.

Class Community.perf.Class1
{

ClassMethod Compare(NumCalls As %Integer)
{
	Set SingleStart=$zh
	Do ..SingleClassCalls(NumCalls)
	Set SingleEnd=$zh
	Set MultiStart=$zh
	Do ..MultiClassCalls(NumCalls)
	Set MultiEnd=$zh
	Set Difference=(MultiEnd-MultiStart)-(SingleEnd-SingleStart)
	Set Percent=1-((SingleEnd-SingleStart)/(MultiEnd-MultiStart))*100
	Write "Same class calls: ",SingleEnd-SingleStart,!
	Write "Diff class calls: ",MultiEnd-MultiStart,!
	Write "Difference: ",Difference," ",Percent,"%",!
}

ClassMethod SingleClassCalls(NumCalls As %Integer)
{
	For i=1:1:NumCalls {
		Set x=..Compute(NumCalls)
	}
}

ClassMethod MultiClassCalls(NumCalls As %Integer)
{
	For i=1:1:NumCalls {
		Set x=##class(Community.perf.Class2).Compute(NumCalls)
	}
}

ClassMethod Compute(Num As %Integer)
{
	;Quit Num
	Set ret=Num
	Quit ret
}

}
Class Community.perf.Class2
{

ClassMethod Compute(Num As %Integer)
{
	;Quit Num
	Set ret=Num
	Quit ret
}

}

Calling the Compare() method with 100M iterations the result is:

EPTEST>do ##class(Community.perf.Class1).Compare(100000000)
Same class calls: 20.992929
Diff class calls: 31.460201
Difference: 10.467272 33.27147210534351%

Please note that changing the Compute() method in both classes to:

ClassMethod Compute(Num As %Integer)
{
    Quit Num
}

Makes a BIG difference:

EPTEST>do ##class(Community.perf.Class1).Compare(100000000)
Same class calls: 4.606181
Diff class calls: 5.52639
Difference: .920209 16.6511773508565266%

With the first version of the method, stack handling is added, so it takes longer and, over 100M calls, the difference is noticeable.
I think the first version, with "some" stack handling, is the more realistic use case.

In conclusion, there is a difference and it is measurable (roughly 0.1 microseconds per call in the first run above). Is it noticeable? Not much. IMHO, in a computation that probably takes many minutes, duplicating code is not worth the gain.

There is, however, a better approach that does not duplicate the code; see my next post.

IMHO the best approach is to take advantage of the object-oriented development environment that IRIS provides: put the common functions/methods in one or more classes, possibly abstract classes, and inherit them in the "main" class.

Class Community.perf.ClassMain Extends Community.perf.ClassAbs
{

ClassMethod Compare(NumCalls As %Integer)
{
	Set Start=$zh
	Do ..ClassCalls(NumCalls)
	Set End=$zh
	Write "Class calls: ",End-Start,!
}

ClassMethod ClassCalls(NumCalls As %Integer)
{
	For i=1:1:NumCalls {
		Set x=..Compute(NumCalls)
	}
}

}
Class Community.perf.ClassAbs [ Abstract ]
{

ClassMethod Compute(Num As %Integer)
{
	;Quit Num
	Set ret=Num
	Quit ret
}

}

How about performance?

EPTEST>do ##class(Community.perf.ClassMain).Compare(100000000)
Class calls: 31.675438

In the latest versions of IRIS (and Caché?), the method code of inherited members is no longer duplicated, so performance is no different from using separate classes (31.68 s here versus 31.46 s for the different-class calls above). Still, I think this approach is more modern, more elegant and, depending on the situation, MUCH more flexible.

First, measuring execution times on modern operating systems where multiple processes run in parallel (on multiple CPUs) is challenging. The following demo application assigns a value to a variable in four different ways:
– in a single method
– in two methods, both in the same class
– in two methods where one method code is in an inherited class, and
– in two methods where one method is in a different class

As expected, the first is the fastest (keyword: loop unrolling) and the last is the slowest, the other two take about the same time.

Class DC.Times Extends (%RegisteredObject, TimesAbstract)
{
ClassMethod ShowTimes()
{
	while $zh#1 {} set t1=$zh for i=1:1:1E6 { do ..Complete() } set t1=$zh-t1
	while $zh#1 {} set t2=$zh for i=1:1:1E6 { do ..OneClass() } set t2=$zh-t2
	while $zh#1 {} set t3=$zh for i=1:1:1E6 { do ..InhClass() } set t3=$zh-t3
	while $zh#1 {} set t4=$zh for i=1:1:1E6 { do ..TwoClass() } set t4=$zh-t4
	write $j(t1,9,5), $j(t2,9,5), $j(t3,9,5), $j(t4,9,5),!
}
/// The complete application is carried out in one method
ClassMethod Complete()
{
	set x=12345
	set y=12345
}
/// The entire application is done in the same class, but with different methods
/// Both methods are local (OneClass + LocTask)
ClassMethod OneClass()
{
	set x=..LocTask()
	set y=..LocTask()
}
/// The entire application is done in the same class, but with different methods
/// One method is local (InhClass) the other is inherited (InhTask)
ClassMethod InhClass()
{
	set x=..InhTask()
	set y=..InhTask()
}
/// The entire application uses two methods in two different classes
ClassMethod TwoClass()
{
	set x=##class(DC.Times2).ExtTask()
	set y=##class(DC.Times2).ExtTask()
}
/// As an "application" we simply return a constant value
ClassMethod LocTask(val)
{
	quit 12345
}
}

Class DC.Times2 Extends %RegisteredObject
{
/// As an "application" we simply return a constant value
ClassMethod ExtTask(val)
{
	quit 12345
}
}

Class DC.TimesAbstract [ Abstract ]
{
/// As an "application" we simply return a constant value
ClassMethod InhTask(val)
{
	quit 12345
}
}

Some time values

USER>

USER>f i=1:1:3 d ##class(DC.Times).ShowTimes()
  0.10833  0.19660  0.19649  0.22001
  0.10837  0.19657  0.19608  0.22000
  0.10826  0.19661  0.19603  0.21992

USER>

USER>f i=1:1:3 d ##class(DC.Times).ShowTimes()
  0.10998  0.19711  0.19643  0.22006
  0.10830  0.19657  0.19624  0.22013
  0.10822  0.19684  0.19628  0.22139

USER>

USER>w $zv
IRIS for UNIX (Ubuntu Server 22.04 LTS for x86-64) 2025.1 (Build 225_1U) Fri May 16 2025 12:18:04 EDT
USER>

Second, the choice of development method (use of include files, class inheritance, multiple classes, a large method in an even larger class, etc.) depends on other factors such as maintainability, runtime priority, etc.

Let's suppose two different routines use one and the same chunk of code. From the object-oriented POV, a good decision is to have this chunk of code in a separate class and have both routines call it. However, whenever you call code outside of the routine as opposed to calling code in the same routine, some execution speed is lost. For reports churning through millions of transactions this lost speed might be noticeable. Any advice how to optimize specifically speed?

What you are asking for is very similar to an inline function in C.

In Caché, macros and/or preprocessor directives are great for this role, especially if the code is small: see "ObjectScript Macros and the Macro Preprocessor".

In this case you will avoid the overhead of calling goto, do, job, xecute, etc.

 
Example (procedure "swap")

When implementing a sorting algorithm doing lots of swaps, this can increase the execution speed.
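
The swap code itself is not shown in this post, so here is a minimal sketch of what such a macro could look like. The class name Demo.SwapMacro, the macro name Swap and the scratch variable swapTmp are made up for illustration. The macro is expanded inline at compile time, so no do or $$ call happens at run time.

```objectscript
Class Demo.SwapMacro
{

/// Hypothetical demo: swap two local variables with an inline macro
ClassMethod Test()
{
	// Assumption: swapTmp is a scratch local that the caller does not use otherwise
	#define Swap(%a,%b) set swapTmp=%a,%a=%b,%b=swapTmp
	set x="left",y="right"
	// the next line expands to: set swapTmp=x,x=y,y=swapTmp
	$$$Swap(x,y)
	write x," ",y,!
}

}
```

Running do ##class(Demo.SwapMacro).Test() should print the two values in swapped order. The trade-off is the same as with inline functions in C: the code is repeated at every place the macro is used, in exchange for avoiding the call.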

PS: And yes, avoid passing input/output parameters and calling class methods, because of their high call overhead.