InterSystems Data Platforms and performance – VM Backups and Caché freeze/thaw scripts

In this post I show strategies for backing up Caché using External Backup, with examples of integrating with snapshot-based solutions. The majority of solutions I see today are deployed on Linux on VMware, so many of the examples in this post show how to integrate with VMware snapshot technology.

Caché backup - batteries included?

Caché online backup is included out of the box with a Caché install for uninterrupted backup of Caché databases. But there are more efficient backup solutions you should consider as systems scale up. External Backup integrated with snapshot technologies is the recommended solution for backing up systems including Caché databases.

Are there any special considerations for external backup?

Online documentation for External Backup has all the details. A key consideration is:

"To ensure the integrity of the snapshot, Caché provides methods to freeze writes to databases while the snapshot is created. Only physical writes to the database files are frozen during the creation of the snapshot, allowing user processes to continue performing updates in memory uninterrupted."

It is also important to note that part of the snapshot process on virtualised systems causes a short pause on the VM being backed up, often called stun time. The stun is usually less than a second, so it is not noticed by users and does not affect system operation; however, in some circumstances it can last longer. If the stun is longer than the quality of service (QoS) timeout for Caché database mirroring, the backup node will think there has been a failure on the primary and will fail over. Later in this post I explain how you can review stun times in case you need to make changes to the mirroring QoS timeout.

A list of other posts in the InterSystems Data Platforms and performance series is here.

For this post you should also review Caché online documentation Backup and Restore Guide.

Backup choices

Minimal Backup Solution - Caché Online Backup

If you have nothing else this comes in the box with the InterSystems data platform for zero downtime backups. Remember Caché online backup only backs up Caché database files, capturing all blocks in the databases that are allocated for data with output written to a sequential file. Caché Online Backup supports cumulative and incremental backups.

In the context of VMware, a Caché Online Backup is an in-guest backup solution. Like other in-guest solutions, Caché Online Backup operations are essentially the same whether the application is virtualised or runs directly on a host. Caché Online Backup must be coordinated with a system backup to copy the Caché online backup output file to backup media along with all other file systems in use by your application. At a minimum, the system backup must include the installation directory, journal and alternate journal directories, application files, and any directory containing external files used by the application.

Caché Online Backup should be considered an entry-level approach for smaller sites that want a low-cost way to back up only Caché databases, or for ad-hoc backups; for example, it is useful when setting up mirroring. However, as databases grow, and because Caché is typically only part of a customer's data landscape, External Backup combined with snapshot technology and third-party utilities is recommended as best practice. Advantages include backup of non-database files, faster restore times, an enterprise-wide view of data, and better catalogue and management tools.

Recommended Backup Solution - External backup

Using VMware as an example: virtualising on VMware adds functionality and choices for protecting entire VMs. Once you have virtualised a solution you have effectively encapsulated your system (the operating system, the application and the data) within .vmdk (and some other) files. These files are straightforward to manage and can be used to recover a whole system, which is very different from a physical system, where you must recover and configure the components separately: operating system, drivers, third-party applications, database, database files and so on.

VMware snapshot

VMware’s vSphere Data Protection (VDP) and other third-party backup solutions for VM backup such as Veeam or Commvault take advantage of the functionality of VMware virtual machine snapshots to create backups. A high-level explanation of VMware snapshots follows; for more details see the VMware documentation.

It is important to remember that snapshots are applied to the whole VM and that the operating system and any applications or the database engine are unaware that the snapshot is happening. Also, remember:

By themselves VMware snapshots are not backups!

Snapshots enable backup software to make backups, but they are not backups by themselves.

VDP and third-party backup solutions use the VMware snapshot process in conjunction with the backup application to manage the creation, and very importantly, deletion of snapshots. At a high level the process and sequence of events for an external backup using VMware snapshots is as follows:

  • Third-party backup software requests ESXi host to trigger a VMware snapshot.
  • A VM's .vmdk files are put into a read only state and a child vmdk delta file is created for each of the VM's .vmdk files.
  • Copy on write is used with all changes to the VM written to the delta files. Any reads are from the delta file first.
  • The backup software manages copying of the read-only parent .vmdk files to the backup target.
  • When the backup is complete the snapshot is committed (VM disks resume writes and updated blocks in delta files written to parent).
  • The VMware snapshot is now removed.

Backup solutions also use other features such as Change Block Tracking (CBT) to allow incremental or cumulative backups for speed and efficiency (especially important for space saving), and typically also add other important functionality such as data deduplication and compression, scheduling, mounting VMs with changed IP addresses for integrity checks etc, full VM and file level restores, and catalogue management.

VMware snapshots that are not managed properly or are left to run for a long time can use excessive storage (as more and more data is changed, the delta files continue to grow) and also slow down your VMs.

You should think very carefully before you run a manual snapshot on a production instance. Why are you doing this? What will happen if you revert back in time to when the snapshot was created? What happens to all the application transactions between creation and roll back?

It is OK if your backup software creates and deletes a snapshot. The snapshot should only be around for a short time. And a key part of your backup strategy will be to choose a time when the system has low usage to minimise further any impact on users and performance.

Caché database considerations for snapshots

Before the snapshot is taken the database must be quiesced so that all pending writes are committed and the database is in a consistent state. Caché provides methods and an API to commit then freeze (stop) writes to databases for the short period while the snapshot is created. This way only physical writes to the database files are frozen during the creation of the snapshot, allowing user processes to continue performing updates in memory uninterrupted. Once the snapshot has been triggered database writes are thawed and the backup continues copying data to backup media. The time between freeze and thaw should be very quick (a few seconds).

In addition to pausing writes, the Caché freeze also handles switching journal files and writing a backup marker to the journal. The journal file continues to be written normally while physical database writes are frozen. If the system were to crash while the physical database writes are frozen, data would be recovered from the journal as normal during start-up.

The following diagram shows freeze and thaw with VMware snapshot steps to create a backup with a consistent database image.

Note the short time between freeze and thaw -- only the time to create the snapshot, not the time to copy the read-only parent to the backup target.

Integrating Caché Freeze and Thaw

vSphere allows a script to be called automatically either side of snapshot creation; this is when Caché freeze and thaw are called. Note: For this functionality to work correctly the ESXi host requests the guest operating system to quiesce the disks via VMware Tools.

VMware Tools must be installed in the guest operating system.

The scripts must adhere to strict name and location rules. File permissions must also be set. For VMware on Linux the script names are:

# /usr/sbin/pre-freeze-script
# /usr/sbin/post-thaw-script

Below are examples of freeze and thaw scripts our team use with Veeam backup for our internal test lab instances, but these scripts should also work with other solutions. These examples have been tested and used on vSphere 6 and Red Hat 7.

While these scripts can be used as examples and illustrate the method you must validate them for your own environments!

Example pre-freeze-script:

#!/bin/sh
#
# Script called by VMware immediately prior to snapshot for backup.
# Tested on Red Hat 7.2
#

# Log locations are examples only; adjust for your environment
LOGDIR=/var/log
SNAPLOG=$LOGDIR/snapshot.log

echo >> $SNAPLOG
echo "`date`: Pre freeze script started" >> $SNAPLOG
exit_code=0

# Only for running instances
for INST in `ccontrol qall 2>/dev/null | tail -n +3 | grep '^up' | cut -c5-  | awk '{print $1}'`; do

    echo "`date`: Attempting to freeze $INST" >> $SNAPLOG

    # Detailed instance specific log (example file name)
    LOGFILE=$LOGDIR/$INST-pre_post.log

    # Freeze; csession exits with 5 on success and 3 on failure
    csession $INST -U '%SYS' "##Class(Backup.General).ExternalFreeze(\"$LOGFILE\",,,,,,1800)" >> $SNAPLOG 2>&1
    status=$?

    case $status in
        5) echo "`date`:   $INST IS FROZEN" >> $SNAPLOG
           ;;
        3) echo "`date`:   $INST FREEZE FAILED" >> $SNAPLOG
           logger -p user.err "freeze of $INST failed"
           exit_code=1
           ;;
        *) echo "`date`:   ERROR: Unknown status code: $status" >> $SNAPLOG
           logger -p user.err "ERROR when freezing $INST"
           exit_code=1
           ;;
    esac
    echo "`date`:   Completed freeze of $INST" >> $SNAPLOG
done

echo "`date`: Pre freeze script finished" >> $SNAPLOG
exit $exit_code

Example thaw script:

#!/bin/sh
#
# Script called by VMware immediately after backup snapshot has been created
# Tested on Red Hat 7.2
#

# Log locations are examples only; adjust for your environment
LOGDIR=/var/log
SNAPLOG=$LOGDIR/snapshot.log

echo >> $SNAPLOG
echo "`date`: Post thaw script started" >> $SNAPLOG
exit_code=0

if [ -d "$LOGDIR" ]; then

    # Only for running instances
    for INST in `ccontrol qall 2>/dev/null | tail -n +3 | grep '^up' | cut -c5-  | awk '{print $1}'`; do

        echo "`date`: Attempting to thaw $INST" >> $SNAPLOG

        # Detailed instance specific log (example file name)
        LOGFILE=$LOGDIR/$INST-pre_post.log

        # Thaw; csession exits with 5 on success and 3 on failure
        csession $INST -U%SYS "##Class(Backup.General).ExternalThaw(\"$LOGFILE\")" >> $SNAPLOG 2>&1
        status=$?

        case $status in
            5) echo "`date`:   $INST IS THAWED" >> $SNAPLOG
               csession $INST -U%SYS "##Class(Backup.General).ExternalSetHistory(\"$LOGFILE\")" >> $SNAPLOG 2>&1
               ;;
            3) echo "`date`:   $INST THAW FAILED" >> $SNAPLOG
               logger -p user.err "thaw of $INST failed"
               exit_code=1
               ;;
            *) echo "`date`:   ERROR: Unknown status code: $status" >> $SNAPLOG
               logger -p user.err "ERROR when thawing $INST"
               exit_code=1
               ;;
        esac
        echo "`date`:   Completed thaw of $INST" >> $SNAPLOG
    done
fi

echo "`date`: Post thaw script finished" >> $SNAPLOG
exit $exit_code

Remember to set permissions:

# sudo chown root:root /usr/sbin/pre-freeze-script /usr/sbin/post-thaw-script
# sudo chmod 0700 /usr/sbin/pre-freeze-script /usr/sbin/post-thaw-script
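
Because the naming and permission rules are strict, a quick check after installing the scripts can save a failed backup later. The following is a minimal sketch, not part of our lab scripts: the helper name check_hook is my own, it relies on GNU stat as shipped with Red Hat 7, and it only checks for the owner and mode set by the commands above.

```shell
#!/bin/sh
# Sketch: report whether a hook script exists, is owned by the expected
# uid (default 0, i.e. root) and has mode 700. check_hook is a
# hypothetical helper name; stat -c is GNU stat syntax.

check_hook() {
    # $1 = script path, $2 = expected owner uid (optional)
    f=$1
    want_uid=${2:-0}
    if [ ! -f "$f" ]; then
        echo "MISSING $f"
        return 1
    fi
    uid=$(stat -c '%u' "$f")     # numeric owner
    mode=$(stat -c '%a' "$f")    # octal permission bits
    if [ "$uid" = "$want_uid" ] && [ "$mode" = "700" ]; then
        echo "OK $f"
        return 0
    fi
    echo "CHECK $f (uid=$uid mode=$mode)"
    return 1
}
```

For example, run check_hook /usr/sbin/pre-freeze-script and check_hook /usr/sbin/post-thaw-script as root after copying the scripts into place.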

Testing Freeze and Thaw

To test the scripts are running correctly you can manually run a snapshot on a VM and check the script output. The following screenshot shows the "Take VM Snapshot" dialog and options.

Deselect - "Snapshot the virtual machine's memory".

Select - "Quiesce guest file system (Needs VMware Tools installed)" check box to pause running processes on the guest operating system so that file system contents are in a known consistent state when you take the snapshot.

Important! After your test remember to delete the snapshot!

If the quiesce flag is true, and the virtual machine is powered on when the snapshot is taken, VMware Tools is used to quiesce the file system in the virtual machine. Quiescing a file system is a process of bringing the on-disk data into a state suitable for backups. This process might include such operations as flushing dirty buffers from the operating system's in-memory cache to disk.

The following output shows the contents of the $SNAPLOG log file set in the example freeze/thaw scripts above after running a backup that includes a snapshot as part of its operation.

Wed Jan  4 16:30:35 EST 2017: Pre freeze script started
Wed Jan  4 16:30:35 EST 2017: Attempting to freeze H20152
Wed Jan  4 16:30:36 EST 2017:   H20152 IS FROZEN
Wed Jan  4 16:30:36 EST 2017:   Completed freeze of H20152
Wed Jan  4 16:30:36 EST 2017: Pre freeze script finished

Wed Jan  4 16:30:41 EST 2017: Post thaw script started
Wed Jan  4 16:30:41 EST 2017: Attempting to thaw H20152
Wed Jan  4 16:30:42 EST 2017:   H20152 IS THAWED
Wed Jan  4 16:30:42 EST 2017:   Completed thaw of H20152
Wed Jan  4 16:30:42 EST 2017: Post thaw script finished

This example shows 6 second elapsed time between freeze and thaw (16:30:36-16:30:42). User operations are NOT interrupted during this period. You will have to gather metrics from your own systems, but for some context this example is from a system running an application benchmark on a VM with no IO bottlenecks and an average of more than 2 million Glorefs/sec, 170,000 Gloupds/sec, and an average 1,100 physical reads/sec and 3,000 writes per write daemon cycle.
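
If you want to track the freeze window over time rather than read it off the log by eye, the elapsed time can be pulled out of the snapshot log with awk. This is a rough sketch under a few assumptions: the function name freeze_seconds is mine, the log lines have exactly the layout shown above (time of day in the fourth field), and a freeze and its thaw are less than 24 hours apart.

```shell
#!/bin/sh
# Sketch: print the seconds between "<instance> IS FROZEN" and the next
# "<instance> IS THAWED" line in the snapshot log written by the example
# scripts. freeze_seconds is a hypothetical helper name.

freeze_seconds() {
    # $1 = snapshot log file, $2 = instance name
    awk -v inst="$2" '
        # Convert HH:MM:SS to seconds since midnight
        function secs(t,   p) { split(t, p, ":"); return p[1] * 3600 + p[2] * 60 + p[3] }
        $0 ~ (inst " IS FROZEN") { frozen = secs($4); have = 1 }
        have && $0 ~ (inst " IS THAWED") {
            d = secs($4) - frozen
            if (d < 0) d += 86400    # freeze and thaw either side of midnight
            print d
            have = 0
        }
    ' "$1"
}
```

Run against the log above, freeze_seconds /var/log/snapshot.log H20152 would print 6 (the log path is the example location used in the scripts).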

Remember that memory is not part of the snapshot, so when a restored VM is started it will boot and Caché will run its normal start-up recovery. Database files will be consistent. You don’t want to "resume" a backup; you want the files at a known point in time. Once the files are recovered you can then roll forward journals and carry out whatever other recovery steps are needed for application and transactional consistency.

For additional data protection a journal switch can also be done by itself and journals backed up or replicated to another location, for example hourly.

Below is output of the $LOGFILE in the example freeze/thaw scripts above showing journal details for the snapshot.

01/04/2017 16:30:35: Backup.General.ExternalFreeze: Suspending system

Journal file switched to:
01/04/2017 16:30:35: Backup.General.ExternalFreeze: Start a journal restore for this backup with journal file: /trak/jnl/jrnpri/h20152/H20152_20170104.011

Journal marker set at
offset 197192 of /trak/jnl/jrnpri/h20152/H20152_20170104.011
01/04/2017 16:30:36: Backup.General.ExternalFreeze: System suspended
01/04/2017 16:30:41: Backup.General.ExternalThaw: Resuming system
01/04/2017 16:30:42: Backup.General.ExternalThaw: System resumed

VM Stun Times

At the creation point of a VM snapshot and after the backup is complete and the snapshot is committed the VM needs to be frozen for a short period. This short freeze is often referred to as stunning the VM. A good blog post on stun times is here. I summarise the details below and put them in context of Caché database considerations.

From the post on stun times: “To create a VM snapshot, the VM is “stunned” in order to (i) serialize device state to disk, and (ii) close the current running disk and create a snapshot point.…When consolidating, the VM is “stunned” in order to close the disks and put them in a state that is appropriate for consolidation.”

Stun time is typically a few hundred milliseconds; however, if there is very high disk write activity during the commit phase, stun time could be several seconds.

If the VM is a Primary or Backup member participating in Caché Database Mirroring and the stun time is longer than the mirror Quality of Service (QoS) timeout the mirror may erroneously report the Primary VM as failed and initiate a mirror takeover.

For more information on Mirroring QoS see the documentation.

Strategies to keep stun time to a minimum include running backups when database activity is low and having well set up storage.

As noted above, there are several options you can specify when creating a snapshot. One option is to include the memory state in the snapshot. Remember, memory state is NOT needed for Caché database backups. If the memory flag is set, a dump of the internal state of the virtual machine is included in the snapshot. Memory snapshots take much longer to create, and are used to allow reversion to a running virtual machine state as it was when the snapshot was taken. This is NOT required for a database file backup.

When taking a memory snapshot the entire state of the virtual machine is stunned, and the stun time is variable.

As noted previously, for backups the quiesce flag must be set to true for manual snapshots or by the backup software to guarantee a consistent and usable backup.

Reviewing VMware logs for stun times

Starting from ESXi 5.0 snapshot stun times are logged in each virtual machine's log file (vmware.log) with messages similar to:

2017-01-04T22:15:58.846Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 38123 us

Stun times are in microseconds, so in the above example 38123 us is 38123/1,000,000 seconds or 0.038 seconds.

To be sure that stun times are within acceptable limits, or to trouble-shoot if you suspect long stun times are causing problems you can download and review the vmware.log files from the folder of the VM that you are interested in. Once downloaded you can extract and sort the log, for example using the example Linux commands below.

Example downloading vmware.log files

There are several ways to download support logs including creating a VMware support bundle through the vSphere management console or from the ESXi host command line. Consult the VMware documentation for all the details, but below is a simple method to create and gather a much smaller support bundle that includes the vmware.log file so you can review stun times.

You will need the long name of the directory where the VM files are located. Log on to the ESXi host where the database VM is running using ssh and use the command: vim-cmd vmsvc/getallvms to list the vmx files and the unique long names associated with them.

For example the long name for the example database VM used in this post is output as:
26 vsan-tc2016-db1 [vsanDatastore] e2fe4e58-dbd1-5e79-e3e2-246e9613a6f0/vsan-tc2016-db1.vmx rhel7_64Guest vmx-11
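
If you have many VMs, the directory name can be picked out of that listing with awk rather than by eye. The sketch below assumes the column layout shown in the example line above and that neither the VM name nor the datastore name contains spaces; the function name vm_dir_name is my own.

```shell
#!/bin/sh
# Sketch: read `vim-cmd vmsvc/getallvms` output on stdin and print the
# directory (long name) of the named VM. Assumes the VM name is field 2
# and the .vmx path is field 4, as in the example above; names containing
# spaces would shift the fields and break this.

vm_dir_name() {
    # $1 = VM name
    awk -v vm="$1" '$2 == vm { split($4, p, "/"); print p[1] }'
}
```

For example: vim-cmd vmsvc/getallvms | vm_dir_name vsan-tc2016-db1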

Next run the command to gather and bundle only log files:
vm-support -a VirtualMachines:logs

The command will echo the location of the support bundle, for example:
To see the files collected, check '/vmfs/volumes/datastore1 (3)/'.

You can now use sftp to transfer the file off the host for further processing and review.

In this example, after uncompressing the support bundle, navigate to the path corresponding to the database VM's long name. In this case:
<bundle name>/vmfs/volumes/<host long name>/e2fe4e58-dbd1-5e79-e3e2-246e9613a6f0

There you will see several numbered log files; the most recent log file has no number, i.e. vmware.log. The log may be only a few hundred KB but it holds a lot of information. We only care about the stun/unstun times, which are easy enough to find with grep. For example:

$ grep Unstun vmware.log
2017-01-04T21:30:19.662Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 1091706 us
2017-01-04T22:15:58.846Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 38123 us
2017-01-04T22:15:59.573Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 298346 us
2017-01-04T22:16:03.672Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 301099 us
2017-01-04T22:16:06.471Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 341616 us
2017-01-04T22:16:24.813Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 264392 us
2017-01-04T22:16:30.921Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 221633 us

We can see two groups of stun times in the example: one from snapshot creation, and a second set 45 minutes later for each disk when the snapshot is deleted/consolidated (e.g. after the backup software has completed copying the read-only vmdk files). In the above example, we can see that most stun times are sub-second, although the initial stun time is just over one second.

Short stun times are not noticeable to an end user. However, system processes such as Caché Database Mirroring continuously monitor whether an instance is ‘alive’. If the stun time exceeds the mirroring QoS timeout, then the node may be considered to be uncontactable and ‘dead’ and a failover will be triggered.

Tip: To review all the logs, or for trouble-shooting, a handy command is to grep all the vmware*.log files and look for any outliers or instances where stun time is approaching the QoS timeout. The following command pipes the output to awk for formatting:

grep Unstun vmware* | awk '{ printf ("%'"'"'d", $8)} {print " ---" $0}' | sort -nr
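
To go one step further and list only the stuns that reach a limit you care about, something like the following can be used. It is a sketch only: the function name stun_over_threshold is mine, the threshold is whatever margin below your own mirror QoS timeout you choose (no value is implied here), and it assumes the Checkpoint_Unstun message format shown above.

```shell
#!/bin/sh
# Sketch: print stun times (converted from microseconds to milliseconds)
# that are at or above a threshold, with the log timestamp.
# stun_over_threshold is a hypothetical helper name.

stun_over_threshold() {
    # $1 = threshold in milliseconds; remaining arguments = vmware log files
    limit_ms=$1
    shift
    grep -h 'Checkpoint_Unstun' "$@" | awk -v limit="$limit_ms" '
        {
            us = ""
            # The value sits between the words "for" and "us"
            for (i = 1; i + 2 <= NF; i++)
                if ($i == "for" && $(i + 2) == "us") us = $(i + 1)
            if (us == "") next
            ms = us / 1000
            ts = $1
            sub(/\|$/, "", ts)    # drop the trailing "|" after the timestamp
            if (ms >= limit) printf "%.1f ms  %s\n", ms, ts
        }'
}
```

For example, stun_over_threshold 1000 vmware*.log lists only stuns of one second or longer; on the sample log above that is the single 1091706 us entry.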


You should monitor your system regularly during normal operations to understand stun times and how they may impact the QoS timeout for HA features such as mirroring. As noted previously, strategies to keep stun/unstun time to a minimum include running backups when database and storage activity is low and having well set up storage. For constant monitoring, logs may be processed using VMware Log Insight or other tools.

I will be revisiting backup and restore operations for InterSystems Data Platforms in future posts. But for now if you have any comments or suggestions based on workflows of your systems please share via the comments sections below.



Hi Murray,
thank you for continuing your series.

Don't you think that VM image backup (despite its importance) has a drawback, as it may contain a huge amount of data that is unnecessary for simple database restoration? E.g., a VM image may contain hundreds of gigabytes of journals that are useless for the database state in the backup file. IMHO, in this case a kind of selective backup can be attractive. I am not familiar with Veeam, but I am sure that Acronis can do it at the file system level. I wonder if selective external backup (e.g., in the case of Veeam) can be integrated with Caché DB freeze/thaw features as easily as a full one?


Hi Alexey, good question. There is no one size fits all. My aim is to highlight how external backups work so teams responsible can evaluate their best solution when talking to vendors.

Third-party solutions will be a suite of management tools, not simply backup/restore, so there are many features to evaluate. For example, products that backup VMs will have features for change block tracking (CBT) so only changed blocks in the VM (not just changes to CACHE.DAT) are backed up. So incremental. But they also include many other features including replication, compression, deduplication, and data exclusion to manage what is backed up, when and what space is required. Snapshot solutions at the storage array level also have many similar functions. You can also create your own solutions integrating freeze/thaw, for example using LVM snapshots. 

Often a Caché application is only one of many applications and databases at a company. So usually the question is turned around to "can you backup <Caché Application x> with <vendor product y>". So now with knowledge of how to implement freeze/thaw you can advise the vendor of your Caché application requirements.




To backup only selected files/filesystems on logical volumes (for example a filesystem on LVM2) the snapshot process and freeze/thaw scripts can still be used and would be just about the same.

As an example the sequence of events is:

  • Start process e.g. via script scheduled via cron.
  • Freeze Caché via script as above.
  • Create snapshot volume(s) with lvcreate.
  • Thaw Caché via script as above.
  • Mount snapshot filesystem(s) (for safety mount read only).
  • Backup snapshot files/filesystems to somewhere else…
  • Unmount snapshot filesystem(s).
  • Remove snapshot volume(s) with lvremove.

Assuming the above is scripted with appropriate error traps. This will work for virtual or physical systems.
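
The sequence above could be sketched as a shell function like the one below. Treat it as an outline under assumptions rather than a finished backup script: the volume group, logical volume, snapshot size, mount point, backup target and instance name are all hypothetical placeholders, tar simply stands in for "backup to somewhere else", and by default it only prints the commands (DRY_RUN=1) so you can review the order before running anything.

```shell
#!/bin/sh
# Sketch of the freeze / LVM snapshot / thaw / copy sequence.
# All names below are hypothetical placeholders, not values from the post.
VG=${VG:-vg_data}                     # volume group
LV=${LV:-lv_db}                       # logical volume holding the databases
SNAP=${SNAP:-lv_db_snap}              # snapshot volume name
SNAP_SIZE=${SNAP_SIZE:-10G}           # space reserved for changed blocks
MNT=${MNT:-/mnt/dbsnap}               # where the snapshot is mounted
TARGET=${TARGET:-/backup/db.tar.gz}   # stand-in backup destination
INST=${INST:-CACHE}                   # Caché instance name
DRY_RUN=${DRY_RUN:-1}                 # 1 = print commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

cache_call() {
    # As in the scripts above, csession exits with 5 when the call succeeds
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ csession $INST -U%SYS $1"
        return 0
    fi
    csession "$INST" -U%SYS "$1"
    [ $? -eq 5 ]
}

lvm_snapshot_backup() {
    cache_call '##Class(Backup.General).ExternalFreeze()' || return 1
    run lvcreate --snapshot --size "$SNAP_SIZE" --name "$SNAP" "/dev/$VG/$LV"
    snap_rc=$?
    # Thaw straight away, whether or not the snapshot was created
    cache_call '##Class(Backup.General).ExternalThaw()' || return 1
    [ "$snap_rc" -eq 0 ] || return 1

    run mkdir -p "$MNT"
    # Read-only mount; an XFS snapshot also needs the nouuid mount option
    if run mount -o ro "/dev/$VG/$SNAP" "$MNT"; then
        run tar -czf "$TARGET" -C "$MNT" .
        rc=$?
        run umount "$MNT"
    else
        rc=1
    fi
    run lvremove -f "/dev/$VG/$SNAP"
    return $rc
}
```

Calling lvm_snapshot_backup with the defaults prints the eight steps in order. Note that the thaw is issued immediately after lvcreate returns, so Caché is only frozen for the time it takes to create the snapshot, not for the copy.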

There are many resources on the web for explaining LVM snapshots. A few key points are:

LVM snapshots use a different copy-on-write to VMware. VMware writes to the delta disk and merges the changes when the snapshot is deleted which has an impact that is managed but must be considered -- as explained above. For LVM snapshots at snapshot creation LVM creates a pool of blocks (the snapshot volume) which also contains a full copy of the LVM metadata of the volume. When writes happen to the main volume the block being overwritten is copied to this new pool on the snapshot volume and the new block is written to the main volume. So the more data that changes between when a snapshot was taken and the current state of the main volume, the more space will get consumed by that snapshot pool. So you must consider the data change rate in your planning. When an access comes for a specific block, LVM knows which block to access.
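
As a back-of-envelope aid for that planning, the arithmetic is simply change rate multiplied by how long the snapshot exists, plus headroom. The helper below is a sketch with illustrative numbers only (they are not measurements from any system); real consumption can be higher than the logical change rate because LVM copies whole chunks, so check actual snapshot usage with lvs during testing.

```shell
#!/bin/sh
# Sketch: estimate snapshot space in MB from a measured change rate.
# snapshot_size_mb is a hypothetical helper; inputs are whole numbers.

snapshot_size_mb() {
    # $1 = changed data in MB per minute
    # $2 = minutes the snapshot will exist
    # $3 = headroom in percent
    echo $(( $1 * $2 * (100 + $3) / 100 ))
}

# Illustrative only: 50 MB/min of changes, 45 minute backup, 50% headroom
snapshot_size_mb 50 45 50
```

With those made-up inputs the estimate is 3375 MB, which you would round up when choosing the size passed to lvcreate.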

Like VMware, best practice for production systems is not to have multiple snapshots of the same volume: every time you write to a block in the main volume you potentially trigger writes in every single snapshot in the tree. For the same reason accessing a block can be slower.

Deleting a single snapshot is very fast. LVM just drops the snapshot pool.



Addendum: I was reminded that there is an extra step when configuring external backups.

When the security configuration requires that the backup script supply Caché credentials, you can do this by redirecting input from a file containing the needed credentials. Alternatively, you can enable OS-level authentication and create a Caché account for the OS user running the script.
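
As a rough illustration of the first option: since the credentials file holds a password, it is worth refusing to use it unless only its owner can read it. In the sketch below the helper name cred_file_ok, the CRED_FILE and INST variables, and the file layout (user name on the first line, password on the second, matching the login prompts) are my assumptions; check the documentation for your version.

```shell
#!/bin/sh
# Sketch: only use a credentials file if it exists and has no group or
# other permission bits set (for example mode 600). cred_file_ok is a
# hypothetical helper name; stat -c is GNU stat syntax.

cred_file_ok() {
    # $1 = path to the credentials file
    [ -f "$1" ] || return 1
    mode=$(stat -c '%a' "$1")
    case "$mode" in
        *00) return 0 ;;
        *)   return 1 ;;
    esac
}

# Usage (not run here; CRED_FILE and INST are placeholders):
#   cred_file_ok "$CRED_FILE" &&
#       csession "$INST" -U%SYS "##Class(Backup.General).ExternalFreeze()" < "$CRED_FILE"
```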

Please see the online documentation for full details.



To clarify, and to answer a question asked offline, here is an example:

"Alternatively, you can enable OS-level authentication and create a Caché account for the OS user running the script."

Create a user for backup functions in the operating system named backup or similar, and add a user with same name in Caché.

Assign an appropriate role to the new Caché user based on your security requirements (for example you can test with %All role).

Then enable the system parameter to allow OS authentication.  Follow the steps in Configuring for Operating-System–based Authentication. (%Service_Terminal on Unix or %Service_Console for Windows).

The advantage of using a standard user name is you have a consistent approach for all instances.  



Murray, did you mean that an LVM snapshot volume is a sequential file, in contrast to the VMware snapshot delta disk, which is random access? So, with a similar DB write load the snapshot size would be greater in the case of LVM, wouldn't it?

I tried LVM snapshot and backup with Caché freezing once upon a time. AFAIR, space had to be reserved in advance inside the LV for the snapshot volume, and if it was too small the snapshot could fail.


Hi, LVM and VMware approach to snapshots are very different. But the way we interact with them with freeze/thaw is similar.

  • freeze Caché/snapshot (VM, LVM, array,etc)/thaw Caché/backup something.../etc

LVM presents a view of the volume (your data) at the instant of the snapshot, then you can copy (backup) all or selected files in that view somewhere else. If you look at the filesystem on the snapshot volume, for example with ls -l, you will see all your files as they were at the snapshot instant. In LVM2 the snapshot can be read/write, which is why I say you should mount it read only for backups. If you look at the parent you see your files as they are now. You must have unused space in the volume group to allocate to the snapshot volume (created with the lvcreate command). Yes, if the snapshot volume fills up with changed data it becomes useless and is discarded. So you need to understand the data change rate at your chosen backup time, but a bit of testing should tell you that. There are videos and more help via $google.

Think of a VMware server instance as just a bunch of files on the datastore disks which encapsulate the whole server including OS, drivers, Caché, data, etc. The VMware delta disk is where all block changes are written since the snapshot started. The parent files are the VM at the instant of the snapshot. You do not have the simple 'view back in time' capability that LVM has. The delta file(s) are written to the VM datastore, so there must be space for that too. But that's not as fiddly as LVM because you probably have a lot of spare space on the datastore -- but you still need to be sure to plan for that as well!

Which backup uses more space really depends on how long the snapshot hangs around. Snapshot files are the size of the changed blocks. You want to delete snapshots as soon as the backup finishes to minimise space used either way. Smart VMware utilities that can do incremental backups through CBT will probably be quickest.
