Journaling Overview - Configuration, Operations, and Utilities

Vic Sun · Feb 28, 2024 27m read

#Beginner #Best Practices #Journaling #System Administration #InterSystems IRIS

What is Journaling?

Journaling is a critical IRIS feature and a part of what makes IRIS a reliable database. While journaling is fundamental to IRIS, there are nuances, so I wrote this article to summarize (more briefly than our documentation, which has all the details) what you need to know. I realize the irony of saying a 27-minute read is brief.

Every modification to a journaled database (sets and kills) is recorded with its timestamp in a journal file. This runs in parallel with writes to the databases and the write image journal (WIJ) for redundancy. IRIS uses the journals to replay database changes in a recovery scenario, to roll back transactions, and to synchronize databases for mirroring.

While this article will focus on what relies on journaling and the utilities used to manage journaling, we will also go through a brief overview of related IRIS operations to explain why journaling is so important. While you are welcome to reach out to the WRC if you need critical support with a journaling issue, having knowledge of the fundamentals can help you configure journaling correctly and respond to certain scenarios appropriately.

Before we begin, our journaling documentation is robust and can be found below. At times I will reference tables from the documentation rather than reproduce them inline. I am also linking to our journaling best practice recommendations - I highly recommend reviewing this document.

https://docs.intersystems.com/irislatest/csp/docbook/Doc.View.cls?KEY=GCDI_journal

https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=GCDI_journal#GCDI_journal_config_bestpract

At the time of writing, the latest documentation (linked) is for IRIS 2023.3 (CD), and the instance I referred to is running IRIS 2023.1.2 (EM).

Journaling Configuration

The system management portal is the easiest place to make journaling changes (System Administration > Configuration > System Configuration > Journal Settings), but outside of that there are routines that can be used to make journaling changes. They are primarily located in the %SYS namespace, and most of them are indexed by the ^JOURNAL routine which calls the other utilities.

Note: the system management portal will be referred to as SMP for brevity.

Journaling can primarily be enabled at the system level and at the database level. By default, journaling is always enabled system wide. There are very few situations in which you would want to disable this, so there is no SMP option; you will need to use the ^JRNSTOP routine. Stopping journaling for the system lasts until IRIS is restarted, at which point journaling will resume automatically. An error could also disable journaling system wide, e.g., running out of journal disk space. If that occurs, you will see a messages.log message warning you that journaling has been disabled. You will need to use the ^JRNSTART routine to resume journaling.

To review journaling at the database level, you can go to the SMP (System Administration > Configuration > System Configuration > Local Databases). The default is for databases to be journaled.

By default, journal file names have the format yyyymmdd.nnn and are stored in <install_dir>/mgr/journal/. If journal compression is in use, the file name is suffixed with “z”. The journal.log file, which records the name of each journal file, is in <install_dir>/mgr/. An entry is purged from the journal log, in sequence, once the journal file itself has been purged and the entry is at least 30 days old. The journal.log is referenced by IRIS to switch and purge journals and can also be used to help replay journals. It should never be modified manually.
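As an illustration of the naming convention above, here is a minimal Python sketch that splits a journal file name into its parts. It is not an IRIS utility: the function and class names are hypothetical, and the pattern assumes an optional prefix, an eight-digit date, a three-digit sequence number, and an optional trailing "z" for compressed files.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional


class JournalName(NamedTuple):
    prefix: str        # optional journal file name prefix ("" if none)
    day: date          # the yyyymmdd part of the name
    sequence: int      # the nnn part of the name
    compressed: bool   # True when the name ends in "z"


# Assumed pattern: [prefix]yyyymmdd.nnn[z]
_NAME = re.compile(r"^(?P<prefix>.*?)(?P<day>\d{8})\.(?P<seq>\d{3})(?P<z>z?)$")


def parse_journal_name(name: str) -> Optional[JournalName]:
    """Return the parts of a journal file name, or None if it does not match."""
    m = _NAME.match(name)
    if m is None:
        return None
    try:
        day = datetime.strptime(m.group("day"), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a real calendar date.
        return None
    return JournalName(m.group("prefix"), day, int(m.group("seq")), m.group("z") == "z")
```

For example, parse_journal_name("20240228.002z") reports the date 2024-02-28, sequence 2, and compressed set to True, while a name such as "journal.log" returns None.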

You can adjust some journaling settings using the SMP (System Administration > Configuration > System Configuration > Journal Settings) or by using the ^JRNOPTS routine which is option 7 of ^JOURNAL. Most of these settings are stored in the iris.cpf parameter file in the [config] and [journal] sections. Almost all of these can be changed without restarting IRIS, though they may incur a journal file switch.

https://docs.intersystems.com/irislatest/csp/docbook/Doc.View.cls?KEY=RCPF_config

https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=RACS_journal

The following settings are in both the SMP and ^JRNOPTS.

Primary and alternate journal directories - If an error occurs writing to the primary journal directory, IRIS will automatically switch to the other.
Journal file size limit - defaults to 1024 MB but can be adjusted up to 4079 MB.
Journal file name prefix
Journal file purge criteria - The default criteria are 2 days or 2 successful backups; if either threshold is reached, the file will be purged when the journal purge task runs (daily). IRIS online backup will automatically report whether it is successful for this purpose. If you are using an external backup solution, you can let IRIS know of a successful backup by calling this function: $$BACKUP^DBACK("","E").
Compress journal files - defaults to enabled. Non-active journal files will be compressed.
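To make the "either threshold" purge rule concrete, here is a small Python sketch of the decision as described above. It only models the logic. The function names and the (name, age, backups) tuple layout are hypothetical, and it ignores the extra conditions (open transactions, mirroring) that can keep a file from being purged.

```python
from typing import Iterable, List, Tuple


def purge_eligible(age_days: float, backups_since: int,
                   max_days: int = 2, max_backups: int = 2) -> bool:
    """True when either default threshold described above has been reached."""
    return age_days >= max_days or backups_since >= max_backups


def files_to_purge(files: Iterable[Tuple[str, float, int]],
                   max_days: int = 2, max_backups: int = 2) -> List[str]:
    """Return the names of the files that meet either purge criterion.

    Each item is (file name, age in days, successful backups since the file).
    """
    return [name for name, age, backups in files
            if purge_eligible(age, backups, max_days, max_backups)]
```

With the defaults, a file that is half a day old but has been followed by two successful backups is eligible, while a one-day-old file followed by a single backup is not.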

The SMP contains other settings not in ^JRNOPTS, including "Freeze on error" - understanding this setting is critical enough that I will devote the following section to discussing its implications. In short, our recommendation is to enable this setting.

There are additional settings available on the "Journal Settings" page that are less used such as "Journal Web Session," "Write image journal directory," and "Target size for the wij." Information on those settings is available in the CPF documentation; I won't be discussing them as they are situational.

There are also advanced journaling settings available in other locations in the SMP.

Note: as of 2023.3 a new journal archiving feature has been introduced such that compressed journals can be moved to a separate drive.

In System Administration > Configuration > Additional Settings > Advanced Memory you can modify the journal buffers setting (jrnbufs), which defaults to 64 MB but can range from 8 or 16 MB to 1024 MB depending on whether your install is 8-bit or Unicode. This setting controls the tradeoff between performance and reliability (journal data stored in memory/buffers is faster, but transient in case of a failure).

In System Administration > Configuration > Additional Settings > Compatibility we have the "SynchCommit" setting, which determines when a TCOMMIT command will request that data be on disk. True makes TCOMMIT wait for the journal to be written, but the default false will allow TCOMMIT to return before the write occurs. This relates to transactions, which are covered a few sections below.

Journal Freeze on Error

The “Freeze on error” setting (System Administration > Configuration > System Configuration > Journal Settings) determines how IRIS will respond to journal I/O errors. The tradeoff you need to consider is availability weighed against data and transactional integrity.

This setting is disabled by default, in which case you can expect the following behavior if a journaling error occurs. The journal daemon will retry every 1s (default) until 150s (default) pass or the instance runs out of memory (global buffers) for the journaled updates. At that point journaling will be disabled system wide, which prevents data after this point from being recoverable in a disaster scenario. Additional concerns with journaling disabled are that transaction rollback will fail, mirroring will fail, and ECP locking/transaction recoverability are compromised. When journaling is disabled due to an error, a message will be logged in the messages.log informing you that you will need to manually reenable journaling using ^JRNSTART. If you have run with journaling disabled, we additionally recommend that you resolve the underlying issue, switch the journal file, and back up the databases.

In general, we would recommend enabling “Freeze on error”, which means that if a journaling error occurs, IRIS will freeze to allow you to resolve the problem. All journaled global updates will stop, and global updates will freeze once the journal daemon is inactive for 30s. The journal daemon will keep retrying and unfreeze when it succeeds. This will likely hang IRIS until the journaling issue is resolved; you will see severity 3 messages in the messages.log notifying you when retrying the failed I/O.

To elaborate on the behavior with respect to transaction rollback, with "Freeze on error" disabled, a failed TROLLBACK will close the transaction and release the locks. With the setting enabled, the process that opened the transaction will halt and the CLNDMN will repeatedly attempt to roll back the transaction. Locks will be held during this operation. If the CLNDMN is attempting to roll back a transaction for a dead job (as seen in the messages.log), you can use Manage^CLNDMN to manually close the transaction. The considerations here are similar to those discussed above: availability vs integrity.

Note: this only applies to local (non-ECP) rollback.

Data Integrity

Journaling is fundamental for disaster recovery. When you write to an IRIS database, IRIS writes those changes twice before ever touching the database itself. All sets and kills are recorded to the journal, which is written to at least every 2 seconds (there are some other triggers which I will omit for simplicity). Changes are also written to the write image journal (WIJ), which stores database modifications for up to 80s before writing to the database in one pass. The combination of journals and the WIJ ensures that if a crash occurs, we can double check what we have written across the journals, WIJ, and database.

This is a brief explanation; for more information our "Data Integrity Guide" is an excellent resource.

https://docs.intersystems.com/irislatest/csp/docbook/Doc.View.cls?KEY=GCDI

When IRIS starts up after an abnormal shutdown, IRIS will determine if it needs to reapply the WIJ or not and will replay journals (as journals are written more frequently, they may be more up-to-date). If you restore a backup, the journals can/should be applied next to replay the sequence of sets and kills and bring the database up to date. Without journaling, you can only restore to the point of the backup, whenever that was taken.

Transactions

Journals facilitate transactions as well. In short, transactions are used to ensure that a sequence of database changes can be treated as a single operation. You have probably already heard the ATM example: you do not want to withdraw money from an account but not have it reach its destination. If you put a sequence of sets and kills between the "TSTART" and "TCOMMIT" commands, with proper locking to handle multiple processes, you can be assured that if one of the changes is written to disk, all of the changes will have persisted. If an error or other event prevents the process from reaching the TCOMMIT, none of the changes within the transaction will be made. If you initiate a transaction but are unable to commit it due to an error or other event, IRIS can refer to the journal to undo all the database changes going back to the TSTART. In fact, due to the nature of transactions, even if a database is not journaled, a transaction will still be journaled. Beware that this only applies to runtime rollbacks; during a recovery startup, a rollback will not occur.

A more in-depth explanation of transactions is available in our "Transaction Processing" documentation:

https://docs.intersystems.com/irislatest/csp/docbook/Doc.View.cls?KEY=GCOS_tp

Mirroring

Mirroring is InterSystems' high availability / disaster recovery / data replication solution. In mirroring, one instance of IRIS sends its journals to one or more other instances of IRIS which execute in sequence the same sets and kills (this process is called dejournaling) to keep their databases synchronized. This allows for logical data replication across nodes. You can never disable journaling on a mirrored database for this reason.

Upon becoming a primary mirror member, the journal files created will have "MIRROR" in the name and the journal log will be logged to a mirrorjrn-mirror_name.log as well as the regular journal.log. The other mirror members will store the mirror journals along with their own journals and will keep a copy of the mirror journal log; mirrorjrn-mirror_name.log is created in install-dir/mgr/.

There are other caveats to journaling and mirroring which I will try to touch on when relevant. Our mirroring documentation is here:

https://docs.intersystems.com/irislatest/csp/docbook/Doc.View.cls?KEY=GHA_mirror

Mirroring Caveats

This section is probably only interesting to you if you are using mirroring. It will touch on how the location of journal files, the purge criteria, and the "Freeze on error" setting are affected by mirroring.

The standard purge criteria for journal files are a certain number of days or backups. However, even if one of those triggers is met, in a mirror journal files may be preserved if they are needed for synchronization. For example, if a journal is needed to complete a transaction on any mirror member, it will not be purged.

A primary failover member will only purge a journal file once local purge criteria have been met and the file has been received by all async and backup mirror members. This applies unless an async has been disconnected for >14 days.

A backup failover member or DR async will purge a journal once the file has been fully dejournaled, local purge criteria have been met, and the journal has been received by all asyncs. The same 14-day exception from above applies.

On a reporting async, mirror journals will be purged immediately after dejournaling by default. This is modifiable in mirror member settings (in CPF: AsyncUseSystemPurgeInterval).

The 14-day timeout is modifiable using the SYS.Mirror.JrnPurgeDefaultWait method, documented here:

https://docs.intersystems.com/irislatest/csp/documatic/%25CSP.Documatic.cls?LIBRARY=%25SYS&CLASSNAME=SYS.Mirror#JrnPurgeDefaultWait

The "Freeze on error" journal setting is automatically enabled for a primary mirror member, as it would not be able to continue to operate as a mirror member without journaling. If the freeze continues for long enough, the backup failover member should detect the inactivity and take over as primary.

When Would You Not Want to Journal

So now we have an idea of why journaling is used and that we generally would want it to be enabled. But are there some situations in which you wouldn't want to journal? For that I will defer to this brief series of articles written by Tani Frankel for Caché but applicable to IRIS, covering the following topics:

How to determine what is causing unusual journaling activity using the journal profile or ^JRNDUMP.
A discussion of what is journaled and why - he also covers disaster recovery and transactions. If you do not need to recover the data and aren't using transactions it may be reasonable to not journal.
Methods of preventing journaling - system wide (^JRNSTOP), CACHETEMP (now IRISTEMP), mapping to non-journaled databases, process-level (^%NOJRN), and when filing an object.

https://community.intersystems.com/post/what-causing-journals-grow-rapidly

https://community.intersystems.com/post/my-growing-journals-how-do-i-minimize

https://community.intersystems.com/post/preventing-globals-getting-journaled-continued-how-do-i-minimize-my-journals

Journaling Operation Tasks

In this chapter, I will be going over some common journaling operations you may need to perform. A more exhaustive list of journaling utilities will be reviewed in the third chapter, "Journaling Utilities."

Starting and stopping journaling at the instance level is not available through the system management portal. You can only make that change through ^JOURNAL or by using ^JRNSTART and ^JRNSTOP.

The main Journals page in the management portal is located at System Operation > Journals.

To switch journal directories there is the "Switch Directory" button on the Journals SMP page, and SWDIR^JOURNAL which is option 13 in ^JOURNAL. If you don't have another directory defined this option is unavailable.

To switch to a new journal file, you can use the "Switch Journal" button on the Journals SMP page, or use ^JRNSWTCH which is option 3 in ^JOURNAL. Journal files will automatically change after a successful backup, if a journal file becomes full, if the journal directory becomes unavailable, and when you change journal settings.

If you want to review a journal file’s contents, you can do so in the system management portal (System Operation > Journals) or by using ^JRNDUMP. SELECT^JRNDUMP will allow you to dump specific journal records to a file, filtering by a variety of criteria.

Note: if you need to look at an arbitrary journal file, you can change the URL of this SMP page to point to another file. This assumes IRIS has permissions to access that file.

To get a general overview of the contents of a journal, you may want to profile the journal using the "Profile" option on the Journals SMP page. This will give you an idea of what globals are being changed - how often and in what databases. You can also see a summary of the journal file's metadata using the "Summary" option on the Journals page. This contains information such as what databases are involved, whether they are encrypted, and when the journal file was created. The Journals page also offers the ability to run an integrity check on a journal file.

If you need to manually purge your journals, for example if some unusual database activity has caused excessive journaling, you can do so using PURGE^JOURNAL, option 6 of ^JOURNAL. This allows you to either purge based on your standard criteria, or to purge all journals not needed for transaction or crash recovery. The default "Purge Journal" task (System Operation > Task Manager > Task Schedule) runs at 00:30, shortly after the default 00:00 "Switch Journal" task. If a purge fails, e.g., because the journal file is required for a transaction or for mirroring, you can expect a messages.log message to let you know this. You should not attempt to delete journal files at the OS level, as this can leave the filesystem out of step with the journal.log and break standard journaling operations.

Finally, you may run into a situation where running a manual journal restore is necessary. The exact mechanisms are discussed more in the ^JRNRESTO section, but the basics are as follows. ^JRNRESTO is option 4 of ^JOURNAL, and you want to ensure that users are not on the system, that journaling is stopped, and that you have restored your backup. You can then run ^JRNRESTO and restart journaling. This operation is not available in the system management portal as it is used only in extenuating circumstances.

Advanced Access to Journaling Information

For more advanced usage such as certain troubleshooting or programmatic needs, the ^%SYS("JOURNAL") global contains detailed journaling information. The %SYS.Journal.System class contains journaling methods and queries as documented here:

https://docs.intersystems.com/irislatest/csp/documatic/%25CSP.Documatic.cls?LIBRARY=%25SYS&CLASSNAME=%25SYS.Journal.System

Journaling Utilities

^JOURNAL

In the prior two sections I have made some references to journaling utilities. Many of these are aggregated within ^JOURNAL, a routine available in the %SYS namespace. This section will go into more detail than the previous ones, and will give specific examples of prompt paths. You may find this more useful as a reference than as an article to read.

The interface will offer you these choices:

1) Begin Journaling (^JRNSTART)

2) Stop Journaling (^JRNSTOP)

3) Switch Journal File (^JRNSWTCH)

4) Restore Globals From Journal (^JRNRESTO)

5) Display Journal File (^JRNDUMP)

6) Purge Journal Files (PURGE^JOURNAL)

7) Edit Journal Properties (^JRNOPTS)

8) Activate or Deactivate Journal Encryption (ENCRYPT^JOURNAL())

9) Display Journal status (Status^JOURNAL)

10) -not available-

11) -not available-

12) Journal catch-up for mirrored databases (MirrorCatchup^JRNRESTO)

13) Switch Journaling to Secondary Directory (SWDIR^JOURNAL)

I will describe these options in the order of how much detail I will provide, from simplest to most complex. Note that some options (such as 10 and 11 above) may appear as "-not available-" depending on your configuration; I will note this.

Options 1 "Begin Journaling (^JRNSTART)", 2 "Stop Journaling (^JRNSTOP)", 3 "Switch Journal File (^JRNSWTCH)", and 13 "Switch Journaling to Secondary Directory (SWDIR^JOURNAL)" are straightforward as they perform a single operation. Option 13 will be unavailable if the instance does not have an alternate journal directory configured.

Option 12 "Journal catch-up for mirrored databases (MirrorCatchup^JRNRESTO)" is also a basic operation for a mirrored instance, and will initiate mirror catchup. Journaling does not need to be enabled at the time, but it must have been started at least once since IRIS startup to have the current journal directory in memory.

Option 7 "Edit Journal Properties (^JRNOPTS)" allows you to modify 5 basic journaling configuration settings. You can change the primary and alternate journal directories, the journal file size limit, the journal file prefix, and your purge criteria.

Option 10 offers "Cluster Journal Restore (CLUMENU^JRNRESTO)". ECP cluster considerations are beyond what I will be covering in this article, but you can find documentation on this option here in the Caché documentation:

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=GCDI_cluster_journal#GCDI_cluster_tools_cjr

Option 8 is "Activate or Deactivate Journal Encryption (ENCRYPT^JOURNAL())", and a deep description will be omitted. I will note that this setting change applies to future journal files. To set up encrypted journal files see this documentation:

https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=ROARS_encrypt_dbmgmt#ROARS_encrypt_dbmgmt_startup_ejf

Option 6 "Purge Journal Files (PURGE^JOURNAL)" allows you to purge with two separate options: you can purge based on your configured criteria (days and successful backups) or purge any journal that is not needed for transaction rollback or crash recovery. It will report what journal files were purged.

Option 9 "Display Journal status (Status^JOURNAL)" provides some journaling meta-information. It will tell you about the current state of journaling, including:

The journal directories and how much space they have remaining.
The current journal file, its max size, and how full it currently is.
Whether journaling is enabled or disabled, and if disabled, why. Note that if journaling is disabled due to an I/O error and the status reports frozen, the journal data will be discarded.
If applicable, it will contain the process IDs of a process running ^JRNSTART, ^JRNSTOP, or ^JRNSWTCH.

Options 11, "Manage pending or in progress transaction rollback (Manage^JRNROLL)", 5 "Display Journal File (^JRNDUMP)", and 4 "Restore Globals From Journal (^JRNRESTO)" are involved enough that I will separate them with their own headers.

^JOURNAL Option 11: Manage^JRNROLL

In general transaction rollback should be quick as transactions should only be open for very brief windows, but if a transaction has been left open for an extended period the instance may have to scan a significant number of journal files prior to normal system startup. We scan for open transactions on instance startup and when becoming a primary mirror member.

If system uptime is critical, Manage^JRNROLL can suspend the rollback process temporarily. In general, I would caution against using this utility as you will almost certainly be sacrificing transactional integrity. You will need to weigh whether uptime is important enough that you can make that tradeoff, and you must also be prepared to take whatever action is necessary to resolve the state of the data from not properly resolving the transactions.

I can't discuss every possible scenario here but I would encourage you to contact the WRC for assistance if you think you may need to use this.

Manage^JRNROLL additionally allows you to monitor the rollback process. It will tell you if the instance is scanning the journal files or actually performing the rollback, how many MB of journals remain to be processed, and how many open transactions were found. Another option to monitor the rollback is the messages.log. In modern versions you can expect messages.log messages reporting on the amount of data remaining to be processed and the number of open transactions.

^JRNRESTO

After applying a backup, ^JRNRESTO is the utility used to apply the journal files from the time of the backup to the present. This is called "dejournaling." ^JRNRESTO will only affect databases that are set to journal when you run the routine.

^JRNRESTO offers several capabilities for refining what journals to restore. Primarily, you can specify a range of journal files. You can also select individual databases or globals to restore. For more granularity you can make use of a custom journal filter from ^ZJRNFILT, which we cover later. You can also restore to either the current instance, or to the databases of another IRIS instance. ^JRNRESTO can also be used to initiate mirror catchup if you are trying to restore mirror journal files to a mirrored database in the same mirror, in which case it will simply use MirrorCatchup^JRNRESTO.

^JRNRESTO Performance

^JRNRESTO allows you to disable journaling for the restore updates, which will improve performance. To further increase the speed of dejournaling, parallel dejournaling can make use of up to 4 jobs to update multiple databases simultaneously. To make use of this you would need at least 8 CPUs and sufficient shared memory heap configured (gmheap). Each parallel job you want to use requires 200 MB of gmheap, meaning a minimum of 400 MB of gmheap is required to use parallel dejournaling. Even if you don't reach that threshold, increasing the shared memory heap can improve performance. Modifying the gmheap setting can be done in the management portal (System Administration > Configuration > Additional Settings > Advanced Memory) and requires a restart. If the configuration allows, by default IRIS will use parallel dejournaling.
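The sizing rule above can be sketched as a quick back-of-the-envelope calculation. The Python function below is only an estimate built from the numbers in this section (up to 4 jobs, at least 8 CPUs, 200 MB of gmheap per job). The function name is hypothetical, and it assumes the whole gmheap is available for dejournaling, which will not be true on a busy instance where other consumers share that memory.

```python
MAX_JOBS = 4        # upper bound on parallel dejournaling jobs stated above
MIN_CPUS = 8        # minimum CPU count stated above
MB_PER_JOB = 200    # gmheap needed per parallel job


def parallel_dejournal_jobs(cpus: int, gmheap_mb: int) -> int:
    """Estimate how many parallel jobs the stated rule allows.

    Returns 0 when the rule is not met, meaning the restore would be
    expected to run without parallel dejournaling.
    """
    if cpus < MIN_CPUS:
        return 0
    jobs = min(MAX_JOBS, gmheap_mb // MB_PER_JOB)
    # Parallelism needs at least two jobs, i.e. the 400 MB minimum.
    return jobs if jobs >= 2 else 0
```

Under these assumptions, 8 CPUs with 400 MB of gmheap gives 2 jobs, 399 MB gives 0, and anything at or above 800 MB is capped at 4.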

Using ^JRNRESTO

You can initiate journal restore by switching to the %SYS namespace and entering "do ^JRNRESTO". The best way to familiarize yourself with this utility is to test it yourself, as you may have a better idea of what kind of journal restores may be necessary in your environment. Example prompts and sample output are available in the documentation and I will outline the steps in the process so you have an idea of what to expect.

If the instance is in a mirror, the first prompt will ask if you want to "Catch-up mirrored databases? No =>". Your response here will determine if you are redirected to MirrorCatchup^JRNRESTO or if you will proceed with further prompts to manually specify what journals to restore to which databases.

If you have defined a journal filter (ZJRNFILT, discussed below), you will be prompted on whether to use that filter or not. MARKER^ZJRNFILT is not applicable to the vast majority of use cases:

"Use current journal filter (ZJRNFILT)?

Use journal marker filter (MARKER^ZJRNFILT)?"

After this you can specify whether to "Process all journaled globals in all directories?" or not. If you do not want to restore everything, you will need to answer a series of prompts to declare what exactly you want to restore. You will be able to choose databases (even for a different instance of IRIS) and globals on a per database basis.

Identifying Databases for ^JRNRESTO

The first clarification requested will be "Are journal files imported from a different operating system?" This will determine whether ^JRNRESTO can use the default OS' directory path format.

The next step will be to specify the source database directory path at the "Directory to restore [? for help]:" prompt. If you are restoring mirrored journals to a non-mirrored database, you can also use the full mirror name here (ex. :mirror:<mirror_config_name>:<mirror_db_name>). The following prompt is "Redirect to directory" to select the target database. You can choose the default and hit <enter> to select the source as the target.

For each database, you will then be asked whether to restore all globals or only specific ones:

"Process all globals in <directory_to_db>? No =>"

Once you have entered all the databases you want to restore, you can hit <enter> at "Directory to restore [? for help]:" in order to proceed. You will be asked to confirm your selections.

NOTE: If you are restoring multiple source databases to one target database, you must either make the same selection of globals or restore all globals.

Identifying Journal Files for ^JRNRESTO

^JRNRESTO tries to make restoring a sequence of journal files as easy as possible. Whether you do any database specification above or are restoring all entries, the first prompt would be "Are journal files created by this IRIS instance and located in their original paths? (Uses journal.log to locate journals)?" If you select "yes" ^JRNRESTO will be able to generate a list of the journal files available from the journal.log rather than requiring the user to manually direct the process to the journal files. You can also select "no" which will ask you if you have a copy of the journal.log from the source IRIS instance, and then to point to the directory containing the journal files that you want to restore.

In either case you will be given the following prompt where you can view a list of journal files and then make your selection. Backups initiate a journal switch and log a message (if you initiate an external freeze for an external backup this will be recorded in the messages.log) which can guide you on what journals you want to restore:

"Specify range of files to process

Enter ? for a list of journal files to select the first and last files from

First file to process:"

If you do not have a journal.log file to refer to, you will still be able to provide journal file names and a directory to look for files.

IRIS will let you review the files you have selected and then will ask if you want to check journal file integrity. If you are attempting to restore an active journal file, IRIS will require and prompt you to switch the file.

Miscellaneous Other Options for ^JRNRESTO

As described above, parallel dejournaling may be used to improve performance. If you are using parallel dejournaling, the process will not be journaled. If you are not using parallel dejournaling, you will still be offered the option of disabling journaling of these updates which will improve performance.

The final choice you will make before beginning the journal restore would be how to respond to errors. You can select responses to both database and journal level errors; you can determine whether you want the process to continue or to abort if it runs into either kind of error. If you choose to abort in either case, parallel dejournaling cannot be used.

The process will then begin reporting on its progress, informing you of what journal file is being processed and periodically the % completion. Once the restore completes a summary of the restore will be shown.

Note: When you restore journals, open transactions will be rolled back. You can ensure that you won't have transaction integrity concerns by rolling back or committing your transactions. If you have concerns about user activity you can restart IRIS in single-user mode as documented here:

https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=ASECMGMT#ASECMGMT_emerg

^ZJRNFILT

You can use ^ZJRNFILT to create a custom filter to use during journal restore, controlling what sets and kills to apply. You will have to create a custom routine accepting the following parameters.

ZJRNFILT(jid,dir,glo,type,restmode,addr,time)

jid - job ID identifying the process (PID) that generated the journal record

dir - full path to directory containing the IRIS.DAT to be restored, specified in journal record

glo - global in journal record

type - command type as specified in documentation here:

https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=GCDI_journal#GCDI_journal_type_tbl

restmode - you will set this to 1 or 0 in your custom code to determine if a record should be restored

addr - address of the journal record

time - timestamp of the record ($horolog format). Note this is the time the journal buffer is created, not when the actual set/kill occurs.
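Because the time parameter arrives in $HOROLOG format, it can help to see how such a value maps to a calendar timestamp. A $HOROLOG value is a "days,seconds" pair, where days are counted from December 31, 1840 and seconds from midnight. Below is a minimal Python sketch for converting one outside of IRIS, for example when post-processing records you have dumped to a file. The helper name horolog_to_datetime is hypothetical and is not part of any IRIS API.

```python
from datetime import datetime, timedelta

# $HOROLOG day 0 is December 31, 1840, so day 1 is January 1, 1841.
HOROLOG_EPOCH = datetime(1840, 12, 31)


def horolog_to_datetime(horolog: str) -> datetime:
    """Convert a 'days,seconds' $HOROLOG string to a naive datetime.

    Hypothetical helper for illustration; the result carries no time zone,
    just like the $HOROLOG value itself.
    """
    days_text, seconds_text = horolog.split(",")
    return HOROLOG_EPOCH + timedelta(days=int(days_text), seconds=int(seconds_text))
```

Inside a filter routine itself you would work with the value in ObjectScript; this sketch is only meant to make the format concrete.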

Examples of writing a ^ZJRNFILT are available in the documentation. If the custom ^ZJRNFILT routine exists, journal restore will prompt you to use the filter. If you use it, the filter is called on each journal record. The logic you write in the routine, using whichever parameters you need, sets restmode to 1 (apply the record) or 0 (do not apply it).

Special Considerations

If the startup process ^STU calls ^JRNRESTO, it will not apply your custom ZJRNFILT filter.

When a journal restore completes, you will be asked to rename or delete ^ZJRNFILT. If you choose to rename it, the routine is copied to ^XJRNFILT and ^ZJRNFILT is deleted.

If your ^ZJRNFILT encounters an error, the restore process will abort.

^JRNDUMP

Sometimes the management portal may be insufficient to review the journal files. This may be because the journal file is so large that you encounter a timeout. In those cases, you can use either ^JRNDUMP or SELECT^JRNDUMP to dump the journal records for ease of viewing.

^JRNDUMP will display a selection of journal files and the directory where they can be found. You can use the following options to navigate that display: "Pg(D)n,Pg(U)p,(N)ext,(P)rev". You can also use "(G)oto" to directly name a journal file to investigate.

The "(I)nfo" option will provide metadata for a journal file including the file GUID, max size, time created, file count, min trans, prev file, prev file GUID, prev file end, next file, and next file GUID. Most of these entries are self-explanatory; "Min Trans", however, contains the journal file and offset at which we know all transactions are closed. After that point there may be open transactions. From here, "(D)atabases" will tell you what databases are affected by that journal file. All the "(I)nfo" information is also available in the "Journal File Summary" page in the system management portal.

"(E)xamine" will display individual journal entries with their address, process ID, operation, directory, global, and value. You can use (N)ext, (P)rev, (G)oto, and (F)ind to navigate these journal addresses. You can further "(E)xamine" to look in-depth at a single entry, which will provide nearly the same information available in the "View Journal" page of the system management portal. This would be the address, type of operation, whether the operation was in a transaction, job ID, process ID, ECP system ID (0 if not configured for ECP), Time stamp, collation, previous and next address, as well as the global and its new value. If the entry was written in a transaction, the old value will also be listed.

A full listing of the different operations/types is available in the documentation (linked directly above in the ZJRNFILT section).

There are three main types of journal records: data records, journal markers, and journal headers. ^JRNDUMP does not show journal headers; journal markers are indicated by Type: JrnMark. Journal markers are used for various system operations such as backups. Almost all entries you will see will be data records.

SELECT^JRNDUMP

SELECT^JRNDUMP allows you to dump filtered journal entries to either the terminal or to a file. You can filter based on the following criteria:

SELECT^JRNDUMP(%jfile,%pid,%dir,%glo,%gloall,%operation,%remsysid)

%jfile - full path to a journal file, default to the current file

%pid - process ID, default to any

%dir - directory of a database, default to any

%glo - global reference, default to any

%gloall - modifier for %glo. If this is 0, the filter will look for exactly the global name specified by %glo. If this is 1, the filter will catch global names that contain the %glo parameter

%operation - operation type, default to any

%remsysid - ECP system ID, default to any

Some examples of specifying filters for SELECT^JRNDUMP are available in the documentation.

^STURECOV

If IRIS startup runs into a journaling or transaction restore error you may be placed into single-user mode. You will see a messages.log message like this:

12/28/18-07:06:43:099 (1234) 1 1 errors during journal restore,

see messages.log file for details.

Startup aborted, entering single user mode.

Enter IRIS' with

iris session IRIS -B

and Do ^STURECOV for help recovering from the errors.
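If you monitor messages.log with external tooling, the message above is a convenient thing to watch for. Here is a minimal Python sketch that scans log lines for the single-user-mode phrase quoted above. The function name find_aborted_startups is hypothetical, and how you obtain the lines (for example, by opening messages.log in your instance's manager directory) is left to you.

```python
from typing import Iterable, List

# Phrase copied from the startup message quoted above.
SINGLE_USER_MARKER = "Startup aborted, entering single user mode."


def find_aborted_startups(lines: Iterable[str]) -> List[int]:
    """Return the 1-based line numbers on which the marker phrase appears.

    Hypothetical helper for illustration; it does a plain substring match
    and makes no assumption about the rest of the log line format.
    """
    return [
        number
        for number, line in enumerate(lines, start=1)
        if SINGLE_USER_MARKER in line
    ]
```

An open file handle is an iterable of lines, so you can pass one directly to this function. An empty result means the phrase was not found in the lines you supplied.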

Depending on the error, ^STURECOV offers different options to help you recover.

Mirror note: ^STURECOV will not be able to run on a mirrored database if transaction rollback is in progress, as the database will not mount read/write until the rollback completes. ^JOURNAL Option 11 (Manage^JRNROLL), described above, covers what you should do in that scenario.

For a journaling error specifically, ^STURECOV's menu will look like this:

1) Display the list of errors from startup

2) Run the journal restore again

3) Bring down the system prior to a normal startup

4) Dismount a database

5) Mount a database

6) Database Repair Utility

7) Check Database Integrity

8) Reset system so journal is not restored at startup

9) Display instructions on how to shut down the system >>>>> [Unix only]

10) Display Journaling Menu (^JOURNAL)

Warning: Option 8 is irreversible and will prevent the next startup from performing journal restore. From that point, if you want to restore journals you would need to use ^JRNRESTO. This should generally only be used if there is a need to circumvent recovery and bring IRIS up.

^JRNMARK

This utility can be used to add a journal marker to the journal file - you likely will not need to use this. You use it as follows:

SET rc=$$ADD^JRNMARK(id,text)

"rc" will return the journal offset and journal file name. id can be any integer value (invalid entries default to 0) and text can be any string up to 256 characters.

You will be able to see the journal marker in the journal file; however, if you want to review the text or the id, you will need to use ^JRNDUMP, as the system management portal does not provide that functionality.
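If you log or export the value returned by $$ADD^JRNMARK, you may want to split it back into its parts. The Python sketch below is illustrative only and rests on two assumptions you should verify against the documentation: that the offset and file name are delimited by a comma, and that a failure is signaled by a non-positive first piece followed by a message. The names MarkerLocation and parse_marker_result are hypothetical.

```python
from typing import NamedTuple


class MarkerLocation(NamedTuple):
    offset: int
    journal_file: str


def parse_marker_result(rc: str) -> MarkerLocation:
    """Split an 'offset,filename' string into its two parts.

    Assumption: a non-positive first piece indicates failure, with the
    remainder of the string describing the error.
    """
    first, _, rest = rc.partition(",")
    value = int(first)
    if value <= 0:
        raise ValueError(f"ADD^JRNMARK reported a failure: {rest}")
    return MarkerLocation(offset=value, journal_file=rest)
```

Splitting only on the first comma keeps the file name intact even if the path itself happens to contain one.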

^JRNUTIL

^JRNUTIL includes various tags that allow for the manipulation of journal files programmatically. This allows you to open, delete, or read from a journal file, among other options.

If you are interested in using these functions you should probably be talking to somebody from InterSystems directly.

^JCONVERT and ^%JREAD

These utilities are only needed for conversion from DSM to Caché. They allow you to convert journal files to a common format. An example is available in the documentation, but this usage is specific enough that I won't cover it.

TLDR

Just read the first section and the best practices documentation. Hopefully you found this journaling overview to be useful - it was written by a human (system admin) for a human. Feel free to comment or leave a question below, or you can direct message me.