Question
· Jan 21, 2017

What is the base formula used for calculating global storage?

Can someone direct me to where in the documentation we can find how consumption may be calculated for global storage?

Caché Version: 2010.1
Operating System: HP OpenVMS 8.4

EDIT: After receiving some responses, it seems I was unclear in my initial inquiry. I am looking to determine our rate of storage consumption, but I am having some difficulty doing that.

While using ^%GSIZE (which the %GlobalEdit class also uses), I got results that appeared odd. My results are below, with the global structure on the left and the usage reported by ^%GSIZE on the right.

Test Global                                  Bytes Used
^TEST=""                                     8
^TEST(1)=""                                  20
^TEST="", ^TEST(1)=""                        16
^TEST(1,2)=""                                24
^TEST="", ^TEST(1,2)=""                      20
^TEST="", ^TEST(1)="", ^TEST(1,2)=""         24
^TEST(1)="", ^TEST(2)=""                     28
^TEST="", ^TEST(1)="", ^TEST(2)=""           24
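For anyone wanting to reproduce a row from the table, a minimal sketch of one test case (the interactive ^%GSIZE prompts are answered manually and are not shown here):

 ; Sketch: create the test global for the third row, then run the
 ; interactive ^%GSIZE utility and read off the size reported for ^TEST.
 kill ^TEST
 set ^TEST=""
 set ^TEST(1)=""
 do ^%GSIZE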

This does not identify how consumption may be calculated; it only shows how to determine how much storage a global has already consumed. I am looking for how to calculate consumption, which would help us identify our own rate of consumption.

EDIT: It may be worth noting that when reviewing a global with this class, the output was 0.008 for multiple tests, whereas ^%GSIZE reported varying byte counts for each test.
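For what it's worth, a constant 0.008 looks like a size reported in megabytes rather than bytes (0.008 MB is roughly one 8 KB block). A minimal sketch of such a check, assuming the %GlobalEdit.GetGlobalSize() class method with allocated/used output parameters; the directory and global name here are illustrative:

 ; Sketch only: assumes ##class(%GlobalEdit).GetGlobalSize(dir,name,.alloc,.used)
 ; reports sizes in MB, which would explain the constant 0.008 (about one 8 KB block).
 set dir="/cache/mgr/user/"                              ; illustrative database directory
 set sc=##class(%GlobalEdit).GetGlobalSize(dir,"TEST",.alloc,.used)  ; name without caret (an assumption)
 if sc write "Allocated: ",alloc," MB, Used: ",used," MB",!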

 set mbSize=$$AllocatedSize^%GSIZE($Name(@Gref))/1048576 // Global @Gref size in MegaBytes

Pros:
- uses a proven calculation algorithm, the same one the interactive ^%GSIZE utility uses (we encountered errors in the %GlobalEdit::GetGlobalSizeBySubscript() calculation, though that was on the rather old version 2015.1.2; not sure about 2017.1);
- accepts the whole global (Gref="^A") as well as its parts (Gref=$name(^A("subs1")) or Gref=$name(^A("subs1","subs2")) or ...) as arguments; see the usage sketch after this list.
Con(s):
- performs only a quick estimation of the global size, although in most cases that is quite enough;
- a precise (slower) version can also be derived from the ^%GSIZE source, but that is a trickier exercise...
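A short usage sketch of both call forms; ^A and its subscripts are placeholders:

 ; Whole global vs. a single subtree, both in MB (placeholder global ^A).
 set wholeMB=$$AllocatedSize^%GSIZE($name(^A))/1048576
 set partMB=$$AllocatedSize^%GSIZE($name(^A("subs1")))/1048576
 write "Whole: ",wholeMB," MB, subtree: ",partMB," MB",!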

My answer would be simple: monitor your database growth and do the same at the global level. It can help you estimate future growth and, with some luck, even point out inefficient solutions that waste space.

E.g., we have a task that calculates the sizes of all globals (usually once a night) and stores the data in a global that serves as a source for visualization and alerting by an external toolset. At the moment an integration with Zabbix is implemented.
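A minimal sketch of such a task, reusing the $$AllocatedSize^%GSIZE entry point from the earlier answer; the ^SizeLog global and the list of watched globals are purely illustrative:

 ; Nightly size-logging sketch (all names are illustrative).
 new day,i,gref
 set day=$zdate($horolog,8)                                      ; e.g. 20170121
 for i=1:1:$length("^TEST,^Orders,^Patients",",") {
   set gref=$piece("^TEST,^Orders,^Patients",",",i)              ; globals to watch
   set ^SizeLog(day,gref)=$$AllocatedSize^%GSIZE($name(@gref))/1048576  ; allocated MB
 }
 ; An external tool (Zabbix in our case) then reads ^SizeLog to chart growth and alert.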

Hi Mack,

there isn't an easy answer for this. The amount of storage depends heavily on how well your data and your global structure compress. A number of mechanisms optimize the storage used when globals are written. It might be useful to have a look at some of the internal mechanisms involved: https://community.intersystems.com/post/internal-structure-cach%C3%A9-database-blocks-part-1.

As you can see from this article (which is still pretty high level), it won't be easy to create an accurate prediction mechanism.

As such, the best way to try this out would be to take a small amount of your source data and store it in Caché. This will give you a baseline of how much overhead you can expect. Depending on your data structures, additional indices might also be created.

So if you store 10 MB, 100 MB, 1 GB, 1 TB of your source data on a test system, you'll be able to get a pretty good prediction curve out of it with a low error rate.
Any other approach is either going to be too much guesswork or going to need a lot more detailed work, so it would probably not be worth the time.
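As a rough sketch of that kind of measurement, again using the $$AllocatedSize^%GSIZE entry point mentioned above (the ^SAMPLE global and the dummy records are made up):

 ; Load increasing numbers of dummy records and note the allocated size,
 ; to build a data-volume vs. storage curve (names and record shape are illustrative).
 new count,i,mb
 for count=1000,10000,100000 {
   kill ^SAMPLE
   for i=1:1:count { set ^SAMPLE(i)=$justify(i,200) }            ; ~200-byte dummy records
   set mb=$$AllocatedSize^%GSIZE($name(^SAMPLE))/1048576
   write count," records -> ",mb," MB allocated",!
 }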

I tried to include an actual path forward for you, so I hope this helps!

HTH,
Fab