Question
· Jan 19

Global Size: Comparing different methods

I am inspecting our DB globals to reduce the size of the worst offenders where possible. When I come across a large global, I want to know which of its nodes are the largest. Hence the code below, which gives unexpectedly different results. Can anyone explain why the results differ?

testpartialGlobalsSize
 D partialGlobalsSize(dirName,globalName)
partialGlobalsSize(dir,global)
 S sub="",path="C:\Cachesys\mgr\"_dir_"\",searchGlobal="^"_global
 S x=##Class(%GlobalEdit).GetGlobalSize(path,global,.Alloc,.Size)
 W global,",",Size,! ; Got 725 here
 S x=##Class(%GlobalEdit).GetGlobalSizeBySubscript(path,global,global,.Size)
 W global,",",Size,! ; Got 1311.9 here
 S totalSize=0
 F  {
        S sub=$O(@searchGlobal@(sub)) Q:sub=""
        S x=##Class(%GlobalEdit).GetGlobalSizeBySubscript(path,global_"("""_sub_""")",global_"("""_sub_""")",.Size)
        S totalSize=totalSize+Size
 }
 W global,",",totalSize,! ; Got 8.56 here
 ; Got 760 from ^%GSIZE
 Q

Product version: Caché 2017.1
Discussion (6)

From my local class docs:

GetGlobalSizeBySubscript
This method will return the size of a global based on the number of database blocks the global resides in.

So you get blocks * block size, i.e. the ALLOCATED size.
Depending on packing, which %GSIZE shows, the difference might be significant.

A summary by subscript will most likely show higher values, because a pointer block
or even a data block (e.g. with 8 KB blocks) may contain more than a single subscript,
depending on the global structure.
Take the default top subscript (aka IDKEY) of Caché classes (an integer > 0) as an example.

Thanks, Robert. A couple of additional questions.

  • Would you say the Used size from GetGlobalSize (the Size argument in the code above; 725 MB) or the %GSIZE result (760 MB) is more correct?
  • What's your take on GetGlobalSizeBySubscript yielding a far smaller total, 8.56 MB, when called for all first-level subscripts? Could it be that it returns only the contents of ^GLOBAL(sub1) itself and not those of ^GLOBAL(sub1,sub2)? BTW, I saw in the debugger that GetGlobalSizeBySubscript calls %GSIZE internally.

  • My personal preference goes to %GSIZE. The best match between CONSUMED and ALLOCATED size is found if your global is filled strictly sequentially, in $Q() order. Even then, a big string may force unexpected block splits. The situation changes if you fill your globals by subscript levels: this may cause a cascade of block splits and result in rather unattractive packing percentages.
  • The ALLOCATED size by subscript might be of interest for an individual subscript, though adding the values up doesn't reflect the total size. It's like cutting a cake for 12 people and then counting the heads that had some cake.
  • To reduce the space consumption of your globals, I'd rather suggest using ##Class(%GlobalEdit).CompactGlobal() to eliminate the effects of random inserts and growth inside a global tree.
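
The cake analogy can be made concrete with a toy model. The sketch below is not Caché code and calls no InterSystems API: the block layout, the subscript names and the `allocated_kb` helper are all made up for illustration, and only the 8 KB block size comes from the example earlier in this thread. It shows that when several subtrees share a block, each of them is charged for the whole block, so the per-subscript allocated sizes add up to more than the global actually occupies.

```python
BLOCK_KB = 8  # assumed block size, as in the 8 KB example in this thread

# Hypothetical layout: block id -> top-level subscripts that have data in it
blocks = {
    0: {"A", "B", "C"},  # three small subtrees packed into one block
    1: {"C", "D"},       # C spills over and shares the next block with D
    2: {"D"},
    3: {"D", "E"},
}


def allocated_kb(subscript=None):
    """Allocated size: every block that is touched counts in full."""
    touched = [
        block_id
        for block_id, subs in blocks.items()
        if subscript is None or subscript in subs
    ]
    return len(touched) * BLOCK_KB


all_subscripts = sorted(set().union(*blocks.values()))
whole = allocated_kb()                                  # the global as a whole
per_subscript = {s: allocated_kb(s) for s in all_subscripts}
summed = sum(per_subscript.values())                    # shared blocks counted repeatedly

print("whole global:", whole, "KB")
print("per subscript:", per_subscript)
print("sum of per-subscript sizes:", summed, "KB")
```

In this made-up layout the global occupies 32 KB, while the per-subscript figures add up to 64 KB. How large the gap is in a real database depends on how many subtrees share blocks.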

GetGlobalSizeBySubscript differs from GetGlobalSize in that GetGlobalSize focuses on the data blocks, while GetGlobalSizeBySubscript also counts the pointer blocks at each level and includes them in the total. Additionally, in your code you specify global(sub) for both the start and end nodes, when you likely want to use this:

x=##Class(%GlobalEdit).GetGlobalSizeBySubscript(path,global_"("""_sub_""")","",.Size)

The latter measures that entire subnode, as per the examples in the Class Reference.

Robert also raises a good point about Used vs. Allocated. GetGlobalSize (and the ^%GSIZE detailed view) return the Used size alongside Allocated, and Used may be less than Allocated due to packing.
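
As a rough illustration of how far apart Used and Allocated can be, the packing can be estimated from the two figures reported in the question. This is plain arithmetic in Python, not a call to any InterSystems API; `packing_percent` is a hypothetical helper, and it assumes that the 1311.9 MB figure is the allocated size and the 725 MB figure is the used size of the same global.

```python
def packing_percent(used_mb, allocated_mb):
    """Share of the allocated space that actually holds data, in percent."""
    if allocated_mb <= 0:
        raise ValueError("allocated_mb must be positive")
    return 100.0 * used_mb / allocated_mb


used_mb = 725.0        # Size returned by GetGlobalSize in the question
allocated_mb = 1311.9  # Size returned by GetGlobalSizeBySubscript for the whole global

packing = packing_percent(used_mb, allocated_mb)
print(f"{packing:.1f}% packed")  # prints "55.3% packed"
```

Under that assumption, a little over half of the allocated space holds data, which would account for most of the gap between the two methods.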

Thanks, everybody, for reading, and Robert and Sarah for replying.

  • Yes, GetGlobalSizeBySubscript returns the Allocated size only.
  • Changing the third GetGlobalSizeBySubscript argument to an empty string makes no difference to the returned Size.
  • GetGlobalSize and ^%GSIZE return the Used size; the two still differ, but not dramatically.
  • I also found a bug in my code. The lengthy InterSystems comments in the GetGlobalSizeBySubscript code mention: "Size - Maximum number of MB to count...Be careful to RESET this for multiple calls to the method". I'd say such a definition of Size is counterintuitive, but at least I now get totalSize=1321.57, very close to the 1311.9 returned by GetGlobalSizeBySubscript(path,global,global,.Size) for the whole global. The final code with results is:

partialGlobalsSize(dir,global)
 S sub="",path="C:\Cachesys\mgr\"_dir_"\",searchGlobal="^"_global,Alloc=0,Size=0
 S x=##Class(%GlobalEdit).GetGlobalSize(path,global,.Alloc,.Size)
 W global,",",Alloc,",",Size,! ; Alloc=1312, Size=725 here
 S Size=0
 S x=##Class(%GlobalEdit).GetGlobalSizeBySubscript(path,global,"",.Size)
 W global,",",Size,! ; 1311.9 here
 S totalSize=0
 F  {
    /// Size - Maximum number of MB to count. If the size of the global exceeds this value,
    /// calculation stops, and an error is returned. If undefined or set to 0, then the entire range is counted.
    /// Be careful to RESET this for multiple calls to the method
    S Size=0 ; Resetting!
    S sub=$O(@searchGlobal@(sub)) Q:sub="" ;$D(@("^MSCG("_t_")"))
    S x=##Class(%GlobalEdit).GetGlobalSizeBySubscript(path,global_"("""_sub_""")","",.Size)
    S totalSize=totalSize+Size
 }
 W global,",",totalSize,! ; 1321.57 here
 Q
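
For anyone who hits the same trap, here is a toy Python model of why the missing reset made the first total so small. `size_by_subscript` is a hypothetical stand-in, not the real method: it only mimics the documented idea that Size is the limit going in and the result coming out. What the real method leaves in Size after stopping early is an assumption here, and the subtree sizes are made up.

```python
def size_by_subscript(actual_mb, limit_mb):
    """Hypothetical stand-in: limit_mb is the input limit, the result is (ok, size).

    A limit of 0 means "no limit". If the subtree is larger than the limit,
    counting stops and an error is reported (ok is False).
    """
    if limit_mb and actual_mb > limit_mb:
        return False, limit_mb  # assumption: the value is left unchanged on error
    return True, actual_mb


subtree_mb = [5.0, 3.0, 400.0, 900.0]  # made-up sizes of four top-level subtrees


def total(reset):
    size = 1311.9  # value left over from an earlier call on the whole global
    acc = 0.0
    for actual in subtree_mb:
        if reset:
            size = 0  # clear the limit before every call
        ok, size = size_by_subscript(actual, size)
        if ok:
            acc += size
    return acc


print("without reset:", total(reset=False))
print("with reset:   ", total(reset=True))
```

In this model the limit shrinks to the smallest subtree seen so far, so every larger subtree after it is cut off and the total stays tiny. Resetting Size before each call removes the limit and every subtree is counted in full.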