Question
· Sep 2

Allocation of Disk Space - Splitting of IRIS.dat into 2 different files

We are currently exploring how we can allocate additional disk space to our environment, as we have seen significant growth in our database files. We have 3 namespaces, each with a single IRIS.dat that contains both the globals and the routines.

Since we started down the route of keeping everything within a single IRIS.dat file per namespace, is it logical, as we see growth, to split the current IRIS.dat for each namespace into a separate IRIS.dat for globals and an IRIS.dat for routines, in a mirrored environment?

I assume that we would have to take the nodes down one at a time to do this. Is this even logical to do?

Or do we add a LUN and move everything over to that new LUN, while still maintaining the single IRIS.dat per namespace?

Thanks

Scott

Product version: IRIS 2022.1
$ZV: IRIS for UNIX (Red Hat Enterprise Linux 8 for x86-64) 2022.1.4 (Build 812_0_22913U) Thu Dec 7 2023 17:06:30 EST [HealthConnect:3.5.0-1.m1] [HealthConnect:3.5.0-1.m1]

Hi Scott,

my first thought here was to use ^GBLOCKCOPY.

GBLOCKCOPY can copy all globals from one database into a new database or namespace.

So in your case, create a new namespace with the correct mappings and new default databases, then use ^GBLOCKCOPY to copy all globals into that namespace. Copying to a namespace follows the namespace's mapping definition and should split globals (data) and routines (which are also stored in globals) into their respective default databases.
These new databases can also be created as mirrored databases and would then be populated on all nodes respectively.

One caveat though: I would test this first, as I am a bit unsure whether it really gets all globals located in a database (including system globals), although I am pretty sure it does.
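
For reference, a rough sketch of the setup step before running ^GBLOCKCOPY, run in the %SYS namespace. The database names (APPDATA, APPCODE), namespace name (APPNEW), and directories are all hypothetical; adjust to your configuration:

    Set $Namespace="%SYS"

    // create the physical database files (directories are hypothetical)
    Set sc=##class(SYS.Database).CreateDatabase("/iris/db/appdata/")
    Set sc=##class(SYS.Database).CreateDatabase("/iris/db/appcode/")

    // register them in the instance configuration
    Set props("Directory")="/iris/db/appdata/"
    Set sc=##class(Config.Databases).Create("APPDATA",.props)
    Kill props
    Set props("Directory")="/iris/db/appcode/"
    Set sc=##class(Config.Databases).Create("APPCODE",.props)

    // new namespace with separate default databases for globals and routines
    Kill props
    Set props("Globals")="APPDATA"
    Set props("Routines")="APPCODE"
    Set sc=##class(Config.Namespaces).Create("APPNEW",.props)

    // then run the interactive copy utility and follow the prompts
    Do ^GBLOCKCOPY

In a mirror, the new databases would still have to be added to the mirror on the appropriate nodes afterwards; this sketch does not cover that part.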

If at the OS layer you used LVM and XFS, you could just add a LUN to the volume group and grow the filesystems. This can be done with everything up.
The backup solution determines for me what counts as a large file; for me that is anything over 1 terabyte. We use external backups.
Most backup solutions only run one process per file. This means fewer, larger IRIS.DATs will always be slower to back up and restore than more, smaller IRIS.DATs.
The growth pattern needs to be understood. If it is going to just keep growing forever, it has to be broken apart, and that will be easier while it is small.
In your place I would upgrade to the 2024 version and explore multivolume databases.
8KB-block IRIS.DATs have a max size of 32 TB, though InterSystems is working on raising this.
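
As a quick way to get a feel for that growth pattern, something like the following (run in %SYS, with a hypothetical database directory) reports the current database size against the free space on its volume; the exact flag semantics of %File.GetDirectorySpace are worth double-checking against the class reference for your version:

    Set $Namespace="%SYS"
    Set dir="/iris/db/appdata/"    // hypothetical database directory
    Set db=##class(SYS.Database).%OpenId(dir)
    If db'="" Write "DB size: ",db.Size," MB, max size: ",db.MaxSize," MB",!
    // free/total space on the volume holding the database; flag 1 should return MB
    Do ##class(%File).GetDirectorySpace(dir,.free,.total,1)
    Write "Volume free: ",free," MB of ",total," MB",!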
 

Hi Scott,

Maybe you can simply use mappings to split the globals from each namespace into different databases and keep those on different disks. In my experience this speeds up the system, since reading from different disks at the same time avoids hitting the maximum IO of the reader daemon. You can even use different block sizes that better match the global structure.

Splitting routines and globals might sound nice, but you'll end up with one very small database (routines) and one massive one (the data).
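
In case it helps, a minimal sketch of such a mapping via the configuration API, run in %SYS; the namespace APP, global ^BigData, and database APPBIG are hypothetical:

    Set $Namespace="%SYS"
    // map ^BigData in namespace APP to a database that lives on its own disk
    Set props("Database")="APPBIG"
    Set sc=##class(Config.MapGlobals).Create("APP","BigData",.props)
    // subscript-level mappings can be defined the same way by putting the
    // subscript range in the map name, or through the Management Portal

Note that adding a mapping only changes where references resolve; any data already sitting in the old default database still has to be copied or merged into the target database.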

Hi Scott,

My remarks:

1. As already mentioned by @David Satorres, global mapping for specific globals (or subscript level mapping) to different databases (located on different disks) may give you a solution for space, and also increase your overall performance.
2. Using ^GBLOCKCOPY is a good idea when there are many small globals. For very big globals it will be very slow (since it uses 1 process per global), so I recommend writing your own code and using the work queue manager (%SYSTEM.WorkMgr) to do merges between databases for a single global in parallel.
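
A rough sketch of what that could look like, assuming a hypothetical class Demo.GlobalSplit, a global ^BigData split on its first subscript, and source/target database directories; the work queue manager runs one merge per top-level subscript in parallel:

    Class Demo.GlobalSplit
    {

    ClassMethod CopySubscript(sub As %String, srcDir As %String, dstDir As %String) As %Status
    {
        // extended references of the form ^["^^directory"]Global address a specific database
        Set srcEnv="^^"_srcDir, dstEnv="^^"_dstDir
        Merge ^[dstEnv]BigData(sub)=^[srcEnv]BigData(sub)
        Quit $$$OK
    }

    ClassMethod Split(srcDir As %String, dstDir As %String) As %Status
    {
        Set queue=$SYSTEM.WorkMgr.%New()
        Set srcEnv="^^"_srcDir
        Set sub=""
        For {
            Set sub=$Order(^[srcEnv]BigData(sub))  Quit:sub=""
            // one work unit per top-level subscript
            Set sc=queue.Queue("##class(Demo.GlobalSplit).CopySubscript",sub,srcDir,dstDir)
            Quit:$$$ISERR(sc)
        }
        Quit queue.WaitForComplete()
    }

    }

This is only a sketch; in practice you would also want error handling per work unit and a sensible batch size if a single first-level subscript is itself very large.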