Your sample works for me: 

USER>s rc = $zf(-100,"/SHELL","pwd")
/cachesys/mgr/user

USER>w $zv
Cache for UNIX (Ubuntu Server LTS for x86-64) 2017.2.2 (Build 865U) Mon Jun 25 2018 10:48:26 EDT

What Caché version are you using? $zf(-100) was added in 2017.2.1 (Build 801_3). For older versions use $zf(-1) or $zf(-2), although upgrading to the latest Caché (or even IRIS?) release would probably be the better choice.

The simplest way to interact with Caché from within bash looks like this:

#!/bin/bash
...
csession TEST -U%SYS << EOF
set \$namespace="%SYS"
write ##class(Security.System).AutheEnabledGetStored("SYSTEM")
halt
EOF
... next line of your bash script ...
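
A note on the backslash in `\$namespace`: with an unquoted `EOF` delimiter, bash expands `$name` inside the here-document before Caché ever sees the text, so ObjectScript special variables have to be escaped. Quoting the delimiter switches the expansion off. A minimal sketch of the three cases, with `cat` standing in for `csession` (an assumption made only so that the snippet runs anywhere):

```shell
#!/bin/bash
# `cat` stands in for `csession` here, purely for illustration.
unset namespace

# Unquoted delimiter, no escape: bash expands $namespace (unset, so empty).
unquoted=$(cat << EOF
set $namespace="%SYS"
EOF
)

# Unquoted delimiter, escaped dollar sign: the text passes through unchanged.
escaped=$(cat << EOF
set \$namespace="%SYS"
EOF
)

# Quoted delimiter: no expansion at all, so no escaping is needed.
quoted=$(cat << 'EOF'
set $namespace="%SYS"
EOF
)

printf '%s\n' "$unquoted" "$escaped" "$quoted"
```

The unquoted form is still the one to use when you want bash variables substituted into the ObjectScript text on purpose.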

The output of Caché `write` and `zwrite` commands goes to STDOUT. As usual, you can redirect it wherever you want, e.g.:

csession TEST -U%SYS << EOF >> /home/james/mysession.log

As parsing a csession log can be a nasty task, I usually avoid it with a construct like this:

#!/bin/bash
...
tf=/home/file.tmp
...
csession TEST -U%SYS << EOF > /dev/null
open "$tf":("NWS"):1 use "$tf"
write ##class(Security.System).AutheEnabledGetStored("SYSTEM")
close "$tf"
halt
EOF
...
result=$(cat "$tf")
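
A possible variation on the construct above, in case a fixed path could collide between concurrent runs: let `mktemp` pick the file name and remove the file afterwards. This is only a sketch. The helper name `cache_eval` is hypothetical, the instance name `TEST` is taken from the sample above, and it assumes that the Caché process is allowed to write to the temporary file.

```shell
#!/bin/bash
# cache_eval is a hypothetical helper name; TEST is the instance from the sample above.
# Usage: cache_eval '<ObjectScript expression>'
cache_eval() {
    local expr=$1 tf
    tf=$(mktemp) || return 1
    # Terminal output is discarded; only what Caché writes to $tf is kept.
    # The file name is quoted because ObjectScript needs it as a string literal.
    csession TEST -U%SYS << EOF > /dev/null
open "$tf":("NWS"):1 use "$tf"
write $expr
close "$tf"
halt
EOF
    cat "$tf"
    rm -f "$tf"
}

# Example call (needs a running instance, so it is left commented out):
# result=$(cache_eval '##class(Security.System).AutheEnabledGetStored("SYSTEM")')
```

Passing the expression in single quotes keeps bash from touching any `$` inside it before it is placed into the here-document.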

I use similar code for the same purpose and it works. Here is my (debug) version, written a while ago:

  ; In:
  ; pFirst - 1st journal file,
  ; pLast - last journal file,
  ; pSDB - source DB directory where journals were generated,
  ; pTDB - target DB directory where journals should be applied
  ;
  ; Sample:
  ; zn "test" set target=$zu(12,""), source=target ; start in target namespace assuming source == target
  ; k ^mtempfilt ; if you injected your ZJRNFILT
  ; d jrnrest^ztestFF("20190917.001","20190919.008",source,target)
  ; zwrite ^mtempfilt
  ; 
 
jrnrest(pFirst,pLast,pSDB,pTDB)

 new (pFirst,pLast,pSDB,pTDB) ;de!!!
 new $namespace set $namespace="%SYS"
 set pLast=$g(pLast,pFirst)
 set pSDB=$g(pSDB,$zu(12)_"user\")
 set pTDB=$g(pTDB,$zu(12)_"test\")
 set RestOref=##class(Journal.Restore).%New()
 set RestOref.FirstFile=pFirst
 set RestOref.LastFile=pLast
 set RestOref.RollBack=0 ;? default = 1
 write !,pFirst_" to "_pLast_"; "_pSDB_" => "_pTDB_" OK?" read ans#1 write ! quit:"nN"[ans
 set sc=RestOref.RedirectDatabase(pSDB,pTDB) if 'sc goto jrnbad
 set sc=RestOref.SelectUpdates(pSDB) if 'sc goto jrnbad ; all globals; need it to fire RedirectDatabase
 set RestOref.Filter="^ZJRNFILT1" ; means nothing as ^ZJRNFILT is always used
 set t0=$zh
 set CHATTY=0 ; has no effect
 set sc=RestOref.Run() if 'sc goto jrnbad
 set t1=$zh
 write "sc="_sc_" dt="_$fn(t1-t0,"",3),!
jrnbad
 if $g(sc)'="",'sc do $system.Status.DisplayError(sc)
 quit

Why do you think that ZJRNFILT doesn't work? Maybe your journal files just don't contain any `set ^ABC(I)=I` commands. After all, our guesses are easy to check. Just inject a couple of statements into your code:

ZJRNFILT(jid,dir,glo,type,restmode,addr,time) /*Filter*/
  Set restmode=1                              /*Return 1 for restore*/
  If glo["ABC",type="S",$i(^mtempfilt("S"))
  If glo["ABC",type="K",$i(^mtempfilt("K")) Set restmode=0 /*except if a kill on ^ABC*/
  Quit

and check the value of ^mtempfilt after the journal restore:

  zw ^mtempfilt

Some details are in the “Suspending All Current Transactions” section of the “Transaction Processing” chapter of the documentation.

Looking through the docs, I wonder whether there is a discrepancy between two of its parts:

1) Suspending All Current Transactions
You can use the TransactionsSuspended() method of the %SYSTEM.Process class to suspend all current transactions system-wide...

2) IRIS 2019.2 Class Reference. %SYSTEM.Process
The TransactionsSuspended(switch) class method controls a switch that will allow a process to temporarily suspend transactions.

I hope that only the second statement is true, isn't it?

Setting an environment variable system-wide is sometimes a rather good solution, especially when Caché/IRIS is the only service on the given server. It was discussed a while ago along with a TZ issue in Caché that was relevant for the versions of that time: https://community.intersystems.com/post/linux-tz-environment-variable-not-being-set-and-impact-cach%C3%A9

Subscript values are inserted into a variable / global in a way that they are automatically sorted, first in canonical form and second (as I understand) in byte order, so stringy non-canonical numbers will still be sorted, but they will appear after canonical numbers and before alpha characters.

The latter is not always true. A quick sample is:

 s MsgDTH="0.001005933", d=MsgDTH+10, b="1E1", c="1A1"
 s a(b)=b, a(c)=c, a(MsgDTH)=MsgDTH, a(d)=d zwrite a
 a(10.001005933)=10.001005933   // canonical number
 a("0.001005933")="0.001005933" // non-canonical number
 a("1A1")="1A1"                 // alpha-numeric string
 a("1E1")="1E1"                 // non-canonical number

Our plan for the near future is to continue development in Caché, with some kind of code converter to IRIS. It would be nice if ISC provided such a utility itself, at least as a starting point for further customisation.

Taking this approach, we would not lose any functionality. Admittedly, this way we restrict ourselves to the features available on both platforms, but that does not seem a huge price for the benefits of a single code base.

Katherine,

thank you for the thorough explanation of possible connection problems. One question: you mentioned an internal method, SetTraceMask(). Can it help to trace problems which occur during an established SSH session? If so, which mask flags should one use?

Specifically, the Caché SSH client sometimes closes the session during the long (and silent) execution of a remote command started with the Execute() method. It happens in a local network, so we don't expect communication problems. We noticed that it most likely happens when the Caché instance acting as the SSH client is under high load, with several SSH sessions to different SSH servers in use concurrently.

You'd hardly find a Caché plugin for any third-party tool, while the SNMP support included in all InterSystems products allows easy integration with most of them (Zabbix, Nagios, etc.). It's even possible to write a custom MIB to provide additional metrics; we tried it and it really worked.

Nowadays more lightweight approaches are getting popular, such as the Grafana-based ones mentioned by Evgeny. I'd personally recommend an enterprise-scale tool such as Zabbix if and only if you really need to monitor all parts of your data center infrastructure (servers, OS, LAN, etc.) besides the DBMS; otherwise it would look like overkill, and I'd prefer a Grafana-based solution.

Both calculation methods give similar results in fast mode, while the %Library.GlobalEdit one seems to be faster. The results of sizing a rather big global are:

< restart Caché >

USER>set t0=$zh
USER>set sc=##class(%Library.GlobalEdit).GetGlobalSize($zu(12,""),"zzz",.a,.u,1)
USER>zwrite sc,a,u write $zh-t0
sc=1
a=135819
22.542591

< restart Caché >

USER>set t0=$zh
USER>set as=$$AllocatedSize^%GSIZE("^zzz")/1024\1024
USER>zwrite as w $zh-t0
as=135818
28.29038

I'm withdrawing the comparison result, as my testing environment was not stable (~30% "natural" fluctuations due to cloud hosting specifics). I'm planning to repeat the test once I get a more stable environment with a global that large.

It can be a sign of overly long Write Daemon buffer queues on your DR Backup member. The reason can be insufficient throughput of its disk I/O subsystem and/or problems with disk I/O. I would start by looking through cconsole.log for messages indicating:

  • abnormal latency of disk access
  • Write Daemon non-responsive "more than 5 minutes" states
  • Write Daemon panic
  • and so on.
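
A first pass over the log can be scripted. The sketch below rests on assumptions: the helper name `scan_console` is hypothetical, the log path is supplied by the caller, each entry is assumed to follow the usual `MM/DD/YY-HH:MM:SS:mmm (pid) severity text` layout with severity 2 meaning severe and 3 meaning fatal, and the "write daemon" pattern is only a guess at the message wording, which differs between versions.

```shell
#!/bin/bash
# scan_console is a hypothetical helper; pass it the path to cconsole.log.
# Assumed line layout: "MM/DD/YY-HH:MM:SS:mmm (pid) severity text".
scan_console() {
    local log=$1
    # Keep entries with severity 2 or higher, plus anything that mentions
    # the Write Daemon (the pattern is an assumption, adjust it as needed).
    awk '($3 ~ /^[0-9]+$/ && $3 + 0 >= 2) || tolower($0) ~ /write ?daemon/' "$log"
}

# Example call (the path is a placeholder):
# scan_console /path/to/mgr/cconsole.log
```

Whatever this turns up is just a starting point; the timestamps of the matching entries show where to look more closely.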

The next step would be to run pButtons monitoring around the clock to pinpoint the suspected issue(s). Looking at OS performance logs can be helpful as well.

Leaving aside "dot syntax" and argumentless "quit" inside "for", what other cases of using "quit" do we have?

According to the docs, "RETURN and QUIT differ when issued from within a FOR, DO WHILE, or WHILE flow-of-control structure, or a TRY or CATCH block." Besides these special cases, both commands work quite similarly.

I didn't want to start discussing "RETURN vs QUIT", but if you insist... 

RETURN Pros:

  • visually clear sign of return from method/$$-function/do-subroutine call,
  • syntactically the same commands exist in many other programming languages,
  • due to its ability to exit from any context without restrictions, RETURN can be regarded as a powerful feature for an "alarm exit" on error.

RETURN Cons:

  • As to "alarm exit": QUIT combined with an outer try / catch block does the same job (and can do even more).
  • It came too late; adding a new feature to the language which was kept unchanged for ages should have stronger reasons than it has (IMHO).
  • Since replacing QUITs with RETURNs in all appropriate places is not an easy task, the project code base may end up with a mixture of legal (and "legacy") QUITs and newer RETURNs introduced by newcomers from other languages. I doubt that it would improve readability. BTW, to my surprise, I noticed RETURNs in one junior's class method, and have never seen them in middles' or seniors' code.
  • It can be important for some companies to support backward compatibility of their products with older and/or other versions of database management systems they use because it's not always possible to upgrade the customers to modern Caché / IRIS versions.
  • (funny one) A lazy developer has no option to use a single-character shortcut for the RETURN command. A three-character word for one command is too long... ;)