Creating Class With a Certain Global Name For Data Storage

Evgeny Shvarov · Feb 19, 2022 2m read

#Data Model #Globals #SQL #Tips & Tricks #InterSystems IRIS

Hi developers!

As you have probably noticed, in IRIS 2021 the names of globals are random.

If you create IRIS classes with DDL and want to know exactly which global is created, you will probably want to provide the name yourself.

And indeed you can do it.

Use WITH %CLASSPARAMETER DEFAULTGLOBAL='^GlobalName' in CREATE TABLE to make it work, as described in the documentation. See the example below:

CREATE TABLE Sample.Person (
   Surname VARCHAR(50) NOT NULL,
   Name VARCHAR(50)
)  WITH %CLASSPARAMETER DEFAULTGLOBAL = '^Sample.Person'

This generates the following storage definition:

Storage Default
{
<Data name="PersonDefaultData">
<Value name="1">
<Value>Surname</Value>
</Value>
<Value name="2">
<Value>Name</Value>
</Value>
</Data>
<DataLocation>^Sample.Person.1</DataLocation>
<DefaultData>PersonDefaultData</DefaultData>
<ExtentLocation>^Sample.Person</ExtentLocation>
<IdFunction>sequence</IdFunction>
<IdLocation>^Sample.Person.1</IdLocation>
<Index name="DDLBEIndex">
<Location>^Sample.Person.2</Location>
</Index>
<Index name="IDKEY">
<Location>^Sample.Person.1</Location>
</Index>
<IndexLocation>^Sample.Person.I</IndexLocation>
<StreamLocation>^Sample.Person.S</StreamLocation>
<Type>%Storage.Persistent</Type>
}

Notice that the data global is not ^Sample.Person but ^Sample.Person.1.

Robert Cemper · Feb 19, 2022

Thank you for raising and extending this subject again. It did not get much of an echo from my article "Storage Considerations on large data sets", published in Sep. 2021.

Evgeny Shvarov · Feb 19, 2022

Thanks @Robert Cemper !

Sylvain Guilbaud · Feb 19, 2022

Thanks for sharing this explanation.

If you want to avoid adding the WITH clause to all your DDL statements, you can also change this default behavior by using:

SET status=$SYSTEM.SQL.Util.SetOption("DDLUseExtentSet",0,.oldval)

do $SYSTEM.SQL.SetDDLUseExtentSet(0, .oldval)

Just tried it: the global name is then not random but follows the ClassD pattern. So for Sample.Person it is ^Sample.PersonD.

Great life-hack, thanks @Sylvain Guilbaud !

Wow, thank you @Sylvain Guilbaud ! This is cool!

Dan Pasco · Feb 22, 2022

First of all, the global names are not random but are based on a hash algorithm to reduce the length of the global name and to reduce the probability of global name collisions. This is the default global name assignment when USEEXTENTSET is true. The benefits of using EXTENTSET mapping are many, but primarily the size of indexes is reduced substantially, making index filing faster. Queries using indexes are also likely to be faster with USEEXTENTSET mapping.

With EXTENTSET, the storage default global is used as a base value for the set of globals used by the extent. The global for each index, including the master map/master data index (MDI), also known as the "DATALOCATION", is the base value (the EXTENTLOCATION) plus ".n", where "n" is a number computed when mapping globals to indexes. The master map/MDI is always ".1". In the original post, the DEFAULTGLOBAL setting overrides the hash computation of the EXTENTLOCATION.
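The naming rule described above can be sketched in a few lines of Python. This is only an illustration, not IRIS code: extent_globals is a hypothetical helper, and handing out the numbers after ".1" sequentially is an assumption, since the explanation only says that "n" is computed during mapping.

```python
def extent_globals(base, index_names):
    """Map each index to its own global under a common base name.

    Assumption: numbers after ".1" are handed out sequentially here.
    The thread only states that "n" is computed during mapping.
    """
    # The master map / MDI always gets ".1".
    globals_by_index = {"IDKEY": f"{base}.1"}
    for n, name in enumerate(index_names, start=2):
        globals_by_index[name] = f"{base}.{n}"
    return globals_by_index


mapping = extent_globals("^Sample.Person", ["DDLBEIndex"])
for index_name, global_name in mapping.items():
    print(index_name, global_name)
```

For the table from the original post, this reproduces the locations shown in its storage definition: ^Sample.Person.1 for IDKEY and ^Sample.Person.2 for DDLBEIndex.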

Compare the index reference for a simple name index on Sample.Person between USEEXTENTSET = 0 and USEEXTENTSET = 1: ^Sample.PersonI("PersonNameIndex","DOE, JOHN Q", 100) vs ^Sample.Person.2("DOE, JOHN Q", 100).
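To make the difference between the two references concrete, here is a small Python sketch that builds both and compares them. The helper global_ref is hypothetical, and the length measured is that of the reference as written in source text, which is only a rough stand-in for how the database encodes keys internally.

```python
def global_ref(name, *subscripts):
    """Format a global reference roughly as it is written in ObjectScript."""
    parts = []
    for s in subscripts:
        if isinstance(s, str):
            # String subscripts are quoted; embedded quotes are doubled.
            parts.append('"' + s.replace('"', '""') + '"')
        else:
            parts.append(str(s))
    return f"^{name}({','.join(parts)})"


# All indexes in one global: the index name is the first subscript.
shared = global_ref("Sample.PersonI", "PersonNameIndex", "DOE, JOHN Q", 100)
# One global per index: no index name subscript is needed.
per_index = global_ref("Sample.Person.2", "DOE, JOHN Q", 100)

print(shared)
print(per_index)
print("extra characters per key:", len(shared) - len(per_index))
```

The extra characters come almost entirely from the index name subscript, which is repeated in every key of the shared index global.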

The developer has to choose whether to use conveniently named globals or better performance.

Note that not all class definitions/tables will benefit from USEEXTENTSET mapping. But many will.

Evgeny Shvarov · Feb 22, 2022

Thanks for the explanation, @Dan Pasco!

Could you please explain why the size of indexes is reduced and why the queries are faster? Is it because of a shorter global name and path to the indexes?

For a lot of developers who are used to meaningful global names, this random hash can be an issue when they want to examine the global values for the related class/table or change them directly and programmatically.

When not using EXTENTSET mapping, all indexes are stored by default in a single global. To keep the index structures separate the first subscript of the index global, by default, is the index name. This creates two conditions that impact performance negatively. Firstly, the index name subscript creates a longer reference for each index key. That leaves less room for index key subscripts and longer references consume resources. Secondly, the size of the index global is increased and the number of index key values per block is reduced. Fewer key values per block means more blocks read during query execution and a less efficient global cache. The index name subscript is essentially "noise".

Using EXTENTSET mapping removes the need for an index identifying subscript (the global is the index).

Ben Spead · Feb 22, 2022

@Dan Pasco - I always appreciate it when you jump into these threads ... I learn so much!!

Neuhold Werner · Feb 22, 2022

I have to disagree about the additional blocks. As an example, create a global with a first subscript named "TitleIndex" and another without it. After producing 65 million entries, compare the sizes with GSIZE: the result is 178750 8K blocks for both index globals. Secondly, I have never needed a total subscript of more than 200 bytes.