Best Ensemble Global Buffer Configuration
Usually I recommend setting the global buffer (GB) as large as possible in order to keep as many globals as possible in memory. I don't know if this is a standard recommendation, but I want to discuss what the best buffer size should be for an Ensemble instance that only processes messages.
I mean, if the instance only runs productions that pass messages, then 90% of the global access is spent persisting the messages. In this situation a customer told me that he prefers a small global buffer, because that way the rest of the memory stays available to the other OS processes. He said that, for him, it is more common to have a lot of Ensemble jobs running and consuming memory while not being intensive in their GB consumption. In this scenario, if we set the GB too high, the system begins to swap memory pages to disk and performance drops. This happens because the GB is allocated from shared memory, and that allocation is done at startup, taking the memory exclusively.
I would appreciate any comments about memory settings for Ensemble productions: pros and cons, use cases, etc.
Regards
In any case, it is possible to check how the global buffer is being used with the very useful tool ^GLOBUFF. If you see that about 80-90% of the buffers are already in use, it means you should increase the global buffer size.
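For example, run it from a terminal session in the %SYS namespace (a sketch; it prompts for how many of the top globals to display, and the exact output layout varies by version):

    zn "%SYS"       ; ^GLOBUFF lives in the %SYS namespace
    do ^GLOBUFF     ; lists which globals currently occupy the buffer pool and the % of buffers each holds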
While ^GLOBUFF is useful for showing the makeup of the current global buffer pool, it is not really useful for sizing the buffer pool. If, for example, ^GLOBUFF shows that ^MyLogGlobal is using a lot of the buffer pool, you may want to investigate whether you are logging too much data and whether this is pushing your main application globals out of the buffer pool sooner.
Most applications will not have sufficient memory to allow their entire working set to reside in the buffer pool, so it is perfectly normal for the buffer pool to be 100% used as shown by ^GLOBUFF. The point of the buffer pool is to cache values in memory to avoid disk IO, so to size it use a tool like ^mgstat, which shows global references and physical reads. If the physical read rate is too high, try increasing the buffer pool in the hope that caching more data will reduce the disk IO.
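For example, ^mgstat can be left collecting data while the production is busy (a sketch; the parameters shown here, sample interval in seconds, number of samples and an output file, are how I recall calling the routine and may differ between versions):

    zn "%SYS"
    do ^mgstat(5,720,"/tmp/mgstat_prod.txt")    ; 5-second samples for one hour, written to a file
    ; compare the Glorefs (global references) and PhyRds (physical reads) columns:
    ; a high physical-read rate relative to global references suggests the buffer pool is too small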
On the original question, my general rule is that the user should be able to calculate how many processes they will need to run their application, measure how much memory each process takes, and from that work out how much memory the processes will need in total. Then, based on the total memory available on the machine, they can subtract this out and allocate, say, 70% of the remainder to the global buffer pool. However, like any performance-related issue, the only way to know for sure is to measure, and to be willing to update your configuration based on those measurements.
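As a rough sketch of that arithmetic (the process count, per-process footprint and machine size below are invented figures for illustration only):

    set totalGB=32                              ; machine with 32GB of RAM
    set processes=400,perProcMB=20              ; expected jobs x measured per-process footprint
    set processGB=processes*perProcMB/1024      ; ~7.8GB needed by the processes themselves
    set osGB=2                                  ; keep some back for the OS kernel
    set remainingGB=totalGB-processGB-osGB      ; ~22GB left over
    write "global buffers ~ ",$fnumber(remainingGB*0.7,"",1)," GB"   ; ~15.5GB at 70%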
^GLOBUFF is a nice tool for showing the current distribution of the cached data (given current memory limitations, past activity and the current caching algorithm), but what is more important is to estimate the maximum working set of your application's own globals, which might be quite big if you have a transaction-heavy application.
[Worth mentioning that in the ^GLOBUFF case you get some quite unique information - the percentage of memory used by process-private globals, which is essential information for SQL-heavy applications.]
But for me it's always interesting to know: what amount of memory would be enough to keep all of your globals in memory? Use ^%GSIZE:
10:57 AM Jul 10 2015
sber.Err                            150
sber.data.AccountD              2833662
sber.data.AccountI              1234223
sber.log.LoadAccount                 88
sber.log.LoadAccount1Proc            88
sber.log.Transaction                 11
sber.log.generateAccount              3
sber.log.generateAccount0             1
sber.log.generateTransaction          4
sber.log.loadBatch                    7
sber.log.loadBatch0                   1
sber.tmp.Account                 999784

TOTAL: 5068022
So we see about 5 million 8KB blocks are necessary to hold the whole application dataset; this can serve as an upper estimate of the memory needed for your application. [I intentionally omit the routine buffer here, which is usually negligible compared to the database buffers; it is very hard to find application code which, in its OBJ form, occupies multiple gigabytes, or even a single GB.]
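Converting that block count into gigabytes is a one-liner in the terminal:

    write $fnumber(5068022*8/1024/1024,"",1)    ; 5,068,022 blocks x 8KB each ~ 38.7GB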
All in all, for the case shown above we started with 40GB of global buffers for this application, because we had plenty of RAM.
Though, returning to the original question, I tend to agree with the recommendation to use ^GLOBUFF, because of the dynamic nature of the data in an Ensemble configuration. There is no hard data, but the statistics are consistent.
Hi all, I'd like to offer some input here. Ensemble workloads are traditionally mostly updates when used purely for message ingestion, some transformations, and delivery to one or more outbound interfaces. As a result, expect to see low physical read rates (as reported in ^mgstat or ^GLOSTAT); however, if there are additional workloads such as reporting or applications built alongside the Ensemble productions, they may drive a higher rate of physical reads.
As a general rule to size memory for Ensemble, we use 4GB of RAM for each CPU (physical or virtual) and then use 50-75% of that RAM for global buffers. So in a 4-core system, the recommendation is 16GB of RAM with 8-12GB allocated to the global buffers. This would leave 4-8GB for the OS kernel and Ensemble processes. When using very large memory configurations (>64GB), using the 75% rule rather than only 50% is ideal because the OS kernel and processes won't need that much memory.
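Applying the same rule to a larger box (the 24-core figure is just an example) shows why the 75% split makes sense once total memory goes beyond 64GB:

    set cores=24
    set ramGB=cores*4                  ; 4GB of RAM per core => 96GB
    set globalBuffersGB=ramGB*0.75     ; >64GB configuration, so use 75% => 72GB of global buffers
    write globalBuffersGB," GB for global buffers, ",ramGB-globalBuffersGB," GB for the OS and processes",!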
One additional note: we highly recommend the use of huge pages (Linux) or large pages (Windows) to provide much more efficient memory management.
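For the Linux case, a minimal sketch of enabling huge pages (assuming the default 2MB huge page size and a shared memory segment of roughly 12.5GB; the actual segment is a bit larger than the global buffer setting alone, so check the size reported in cconsole.log at instance startup):

    # 12.5GB of shared memory / 2MB per huge page = 6400 pages
    echo "vm.nr_hugepages = 6400" >> /etc/sysctl.conf
    sysctl -p
    grep Huge /proc/meminfo     # confirm HugePages_Total and HugePages_Free before restarting the instance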
Hey Mark,
Quick question: would this work if, for instance, you're running 3 separate instances on a host?