I use similar code for the same purpose, and it works. Here is my (debug) version, written a while ago:

  ; In:
  ; pFirst - 1st journal file,
  ; pLast - last journal file,
  ; pSDB - source DB directory where journals were generated,
  ; pTDB - target DB directory where journals should be applied
  ;
  ; Sample:
  ; zn "test" set target=$zu(12,""), source=target ; start in target namespace assuming source == target
  ; k ^mtempfilt ; if you injected your ZJRNFILT
  ; d jrnrest^ztestFF("20190917.001","20190919.008",source,target)
  ; zwrite ^mtempfilt
  ; 
 
jrnrest(pFirst,pLast,pSDB,pTDB)

 new (pFirst,pLast,pSDB,pTDB) ;de!!!
 new $namespace set $namespace="%SYS"
 set pLast=$g(pLast,pFirst)
 set pSDB=$g(pSDB,$zu(12)_"user\")
 set pTDB=$g(pTDB,$zu(12)_"test\")
 set RestOref=##class(Journal.Restore).%New()
 set RestOref.FirstFile=pFirst
 set RestOref.LastFile=pLast
 set RestOref.RollBack=0 ;? default = 1
 ; ask for confirmation; an empty answer also quits, since "nN"["" is true
 write !,pFirst_" to "_pLast_"; "_pSDB_" => "_pTDB_" OK?" read ans#1 write ! quit:"nN"[ans
 set sc=RestOref.RedirectDatabase(pSDB,pTDB) if 'sc goto jrnbad
 set sc=RestOref.SelectUpdates(pSDB) if 'sc goto jrnbad ; all globals; need it to fire RedirectDatabase
 set RestOref.Filter="^ZJRNFILT1" ; means nothing as ^ZJRNFILT is always used
 set t0=$zh
 set CHATTY=0 ; has no effect
 set sc=RestOref.Run() if 'sc goto jrnbad
 set t1=$zh
 write "sc="_sc_" dt="_$fn(t1-t0,"",3),!
jrnbad
 if $g(sc)'="",'sc do $system.Status.DisplayError(sc)
 quit

Why do you think that ZJRNFILT doesn't work? Maybe your journal files just don't contain any `set ^ABC(I)=I` commands. After all, our guesses are easy to check. Just inject a couple of statements into your code:

ZJRNFILT(jid,dir,glo,type,restmode,addr,time) /*Filter*/
  Set restmode=1                              /*Return 1 for restore*/
  If glo["ABC",type="S",$i(^mtempfilt("S"))
  If glo["ABC",type="K",$i(^mtempfilt("K")) Set restmode=0 /*except if a kill on ^ABC*/
  Quit

and check ^mtempfilt value after the journal restoration:

  zw ^mtempfilt

Some details are in the “Suspending All Current Transactions” section of the “Transaction Processing” chapter of the documentation.

Looking through the docs, I'm getting curious if there is a discrepancy between its different parts:

1) Suspending All Current Transactions
You can use the TransactionsSuspended() method of the %SYSTEM.Process class to suspend all current transactions system-wide...

2) IRIS 2019.2 Class Reference. %SYSTEM.Process
The TransactionsSuspended(switch) class method controls a switch that will allow a process to temporarily suspend transactions.

I hope that only the second statement is true, isn't it?
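A minimal sketch of how the per-process reading could be checked. It assumes that TransactionsSuspended(1) returns the previous state of the switch and that ^demo is a scratch global in a journaled database; both are my assumptions, not statements from the docs quoted above:

```objectscript
 kill ^demo
 tstart
 set ^demo("in-transaction")=1                       ; part of the transaction
 set prev=$system.Process.TransactionsSuspended(1)   ; suspend for this process only (assumed)
 set ^demo("suspended")=1                            ; should not be rolled back
 do $system.Process.TransactionsSuspended(prev)      ; restore the previous state
 trollback
 zwrite ^demo
```

If the switch is indeed per-process, I'd expect only ^demo("suspended") to survive the rollback, while a transaction running concurrently in another process should not be affected.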

Setting an environment variable system-wide is sometimes a rather good solution, especially when Caché/IRIS is the only service on the given server. It was discussed a while ago along with a TZ issue in Caché, which was relevant for the versions of those days: https://community.intersystems.com/post/linux-tz-environment-variable-not-being-set-and-impact-cach%C3%A9

Subscript values are inserted into a variable / global in a way that they are automatically sorted, first in canonical form and second (as I understand) in byte order, so stringy non-canonical numbers will still be sorted, but they will appear after canonical numbers and before alpha characters.

The latter is not always true. A quick sample is:

 s MsgDTH="0.001005933", d=MsgDTH+10, b="1E1", c="1A1"
 s a(b)=b, a(c)=c, a(MsgDTH)=MsgDTH, a(d)=d zwrite a
 a(10.001005933)=10.001005933   // canonical number
 a("0.001005933")="0.001005933" // non-canonical number
 a("1A1")="1A1"                 // alpha-numeric string
 a("1E1")="1E1"                 // non-canonical number

Our near-term plan is to continue development in Caché, with some kind of code converter to IRIS. It would be nice if ISC provided such a utility itself, at least as a starting point for further customisation.

Taking this approach, we would not lose any functionality. We admit that this way we are restricting ourselves to the features available on both platforms, but it does not seem a huge price for the benefits of a single code base.

 Katherine,

thank you for the thorough explanation of possible connection problems. I have a question: you mentioned an internal method, SetTraceMask(). Can it help to trace problems that occur during an established SSH session? If so, which mask flags should one use?

Specifically, the Caché SSH client sometimes closes the session during a long (and silent) execution of a remote command started with the Execute() method. It happens in a local network, so we don't expect communication problems. We noticed that this most likely happens when the Caché instance acting as the SSH client is under high load and several SSH sessions with different SSH servers are in use concurrently.

You'd hardly find a Caché plugin for any 3rd-party tool, while the SNMP support included in all InterSystems products allows easy integration with most of them (Zabbix, Nagios, etc.). It's even possible to write a custom MIB to provide additional metrics; we tried it and it really worked.

Nowadays more lightweight approaches are getting popular, such as the Grafana-based ones mentioned by Evgeny. I'd personally recommend using an enterprise-scale tool such as Zabbix if and only if you really need a solution to monitor all parts of your data center infrastructure (servers, OS, LAN, etc.) besides the DBMS; otherwise it would look like overkill, and I'd prefer a Grafana-based one.

Both methods of calculation give similar results in fast mode, while the %Library.GlobalEdit one seems to be faster. The results of sizing a rather big global are:

< restart Caché >

USER>set t0=$zh
USER>set sc=##class(%Library.GlobalEdit).GetGlobalSize($zu(12,""),"zzz",.a,.u,1)
USER>zwrite sc,a,u write $zh-t0
sc=1
a=135819
22.542591

< restart Caché >

USER>set t0=$zh
USER>set as=$$AllocatedSize^%GSIZE("^zzz")/1024\1024
USER>zwrite as w $zh-t0
as=135818
28.29038

I am withdrawing the comparison result, as my testing environment was not stable (~30% "natural" fluctuations due to cloud hosting specifics). I plan to repeat the test with similarly large global(s) once I get a more stable environment.

It can be a sign of overly long Write Daemon buffer queues on your DR Backup member. The reason can be insufficient throughput of its disk i/o subsystem and/or problems with disk i/o. I would start by looking through cconsole.log for messages indicating:

  • abnormal latency of disk access
  • Write Daemon non-responsive "more than 5 minutes" states
  • Write Daemon panic
  • and so on.

The next step would be to start pButtons monitoring around the clock to pinpoint the suspected issue(s) more precisely. Looking at OS performance logs can be helpful as well.

Leaving aside "dot syntax" and argumentless "quit" in "for", what cases of using "quit" do we have?

According to the docs, "RETURN and QUIT differ when issued from within a FOR, DO WHILE, or WHILE flow-of-control structure, or a TRY or CATCH block." Besides these special cases, both commands work quite similarly.
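To illustrate the special case the docs mention, here is a small sketch with two hypothetical class methods (the names and the task are mine, made up for illustration). Both return the position of the first negative list element, or 0 if there is none:

```objectscript
/// RETURN leaves both the FOR loop and the method at once
ClassMethod FirstNegativeR(list As %List) As %Integer
{
  for i=1:1:$listlength(list) {
    if $list(list,i)<0 { return i }
  }
  return 0
}

/// Argumentless QUIT leaves only the FOR loop,
/// so the result has to be carried out in a variable
ClassMethod FirstNegativeQ(list As %List) As %Integer
{
  set pos=0
  for i=1:1:$listlength(list) {
    if $list(list,i)<0 { set pos=i quit }
  }
  quit pos
}
```

Outside of such loop and TRY/CATCH contexts, swapping one command for the other should not change the behavior.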

I didn't want to start discussing "RETURN vs QUIT", but if you insist... 

RETURN Pros:

  • visually clear sign of return from method/$$-function/do-subroutine call,
  • syntactically the same commands exist in many other programming languages,
  • due to its ability to exit from any context without restrictions, RETURN can be accepted as a powerful feature for "alarm exit" on error.

RETURN Cons:

  • As to "alarm exit": QUIT combined with an outer try / catch block does the same job (and can do even more).
  • It came too late; adding a new feature to a language that was kept unchanged for ages should have stronger reasons than this one has (IMHO).
  • Since replacing QUITs with RETURNs in all appropriate places is not an easy task, the project code base may end up with a mixture of legal (and "legacy") QUITs and newer RETURNs introduced by newcomers from other languages. I doubt that it would improve readability. BTW, to my surprise, I noticed RETURNs in one junior's class method, and never met them in middles' or seniors' code.
  • It can be important for some companies to keep their products backward compatible with older and/or other versions of the database management systems they use, because it's not always possible to upgrade the customers to modern Caché / IRIS versions.
  • (funny one) A lazy developer has no option to use a single-character shortcut for the RETURN command. A three-character word for one command is too long... ;)
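A sketch of the "alarm exit" without RETURN mentioned in the first point above. The method name, the error name "Alarm" and the code 5001 are arbitrary values chosen for illustration:

```objectscript
ClassMethod Process() As %Status
{
  set sc=$$$OK
  try {
    for i=1:1:10 {
      ; THROW leaves the loop and the TRY block at once, from any nesting depth
      if i=5 { throw ##class(%Exception.General).%New("Alarm",5001,,"i="_i) }
    }
  } catch ex {
    ; extra work (logging, cleanup) can be done here before leaving
    set sc=ex.AsStatus()
  }
  quit sc
}
```

The method keeps a single exit point, and the catch block is where the "even more" part can live.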

Is that supposed to be hard? I immediately visually determined the result 20

Vitaliy, you are apparently not a beginner.

the t3 method code is equivalent to the following code

No, because my version can return a non-empty value having been called as a function, while yours can't.

absolutely nothing will change fundamentally if you replace "return" with "quit"

Absolutely agree, and that was my initial point, already published in a recent discussion:

  • If "QUIT" was used correctly, and one replaced it with "RETURN", no miracle will happen with code readability.
  • In contrast, forgetting or being unaware of possible side effects can make code understanding harder.

The sample was inspired by those thoughts. Its initial version was a bit more tricky, but InterSystems discourages even mentioning the "old syntax" in discussions, while IMHO even a beginner should be aware of it and its caveats.

As to RETURN, it seems that InterSystems nowadays promotes this command as a more visually clear way to exit methods. Before it was added to Caché, the docs mentioned "Implicit QUIT"; now they tell us about "Implicit RETURN", while "Implicit QUIT" is still around.

Evgeny,

Thank you for teaching me COS :)

Just a small note to finish it:

  • The rewritten sample is not a functional equivalent of the original one and will return a different result (once the syntax error is corrected).
  • My sample was not about "old" syntax vs. "new" one; I just wanted to emphasize that simply substituting "return" for "quit" may or may not improve the readability of an already existing class/routine; it all depends...

Evgeny, I agree with you that there is a difference, though I'm not sure what percentage of community members would give the right answer to the question below without looking into the docs and running the code. AFAIK, voting is impossible here.

What result will be returned by the method:

ClassMethod t1()
 {
  do t2(.a,.b)
  do
  . set a=4,b=5
  . return $increment(a)*$increment(b)
  return $increment(a,-1)*$increment(b,-1)
t2(&pA,&pB) ;
  set pA=2,pB=3
  return pA*pB
 }

No, because if those

 quit sc

ever worked, they were used according to the rule cited above:

outside of a block structure or from within an IF, ELSEIF, or ELSE code block.

This is very personal, but I would never use RETURN, because it introduces backward incompatibility with older Caché versions without any real reason other than using yet another "me too" "modern" feature. Besides, it tempts the developer to bypass the modular programming rule that insists on one and only one entry as well as exit for each functional module. Isn't that rule about real (rather than "me too") readability?

Again, that's only IMO, and I doubt that it's possible to issue any community-accepted rule set for COS programming style.

Continuing Robert's list:

5) CACHETEMP is always local on APP servers (= ECP clients). In our experience, it's a very important feature, because it allows keeping the processing of temporary data local, without extra network and ECP server disk i/o workload. One of the surprises ECP gave me: while it's relatively easy to achieve high-speed intra-ECP networking (as long as ~10Gbit/s hardware is available), the ECP server's disk i/o subsystem can easily become a bottleneck unless you carefully spread data processing among ECP clients.

6) Sergio, you wrote:

I mean, all the data would be stored on disk and will have to be synchronized through the net with the other APP servers.

If you really need to synchronize even temporary data, then simple horizontal scaling with ECP without some optimization of data processing (see p.5) can be less cost-effective than a comparable vertical scaling solution.