Hi James,

Agree with you, parsing terminal output is not the smartest solution. I always try to avoid it by using intermediate files. E.g. (from real life):

rm -f $mydir/db_temp
NspToCheck=$nspace

# run a non-interactive Caché session; its terminal output is discarded,
# the result is returned via the intermediate file instead
csession $instance -U%SYS << EOF > /dev/null
set fn="$mydir/db_temp"
o fn:("WNS"):1 u fn
try {
  zn "$NspToCheck"  ; throws if the namespace is unknown
  w \$s(\$zu(5)=\$zcvt("$NspToCheck","U"):\$zutil(12,""),1:"$NspToCheck")  ; default DB directory
} catch {
  w \$p(\$ze,">")_">"  ; the error code only
}
c fn
h
EOF

DbQms=$(cat "$mydir/db_temp")

Here the default DB directory of the namespace $NspToCheck (or the $ZError code) is written to the $mydir/db_temp file; then it goes to the $DbQms shell variable and is processed as needed.

The initial answer has been amended.

Your sample works for me: 

USER>s rc = $zf(-100,"/SHELL","pwd")
/cachesys/mgr/user

USER>w $zv
Cache for UNIX (Ubuntu Server LTS for x86-64) 2017.2.2 (Build 865U) Mon Jun 25 2018 10:48:26 EDT

What Caché version are you using? $zf(-100) was added in 2017.2.1 (Build 801_3). For older versions use $zf(-1) or $zf(-2), though upgrading to the latest Caché (or even IRIS?) release would probably be the better choice.
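
For reference, a rough fallback sketch for older versions: unlike $zf(-100), $zf(-1) returns only the OS command's exit status, so the output has to be captured via a file (the /tmp path is illustrative):

USER>s rc=$zf(-1,"pwd > /tmp/pwd.txt") ; rc holds the exit status, 0 on success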

I use similar code for the same purpose, and it works. Here is my (debug) version, written a while ago:

  ; In:
  ; pFirst - 1st journal file,
  ; pLast - last journal file,
  ; pSDB - source DB directory where journals were generated,
  ; pTDB - target DB directory where journals should be applied
  ;
  ; Sample:
  ; zn "test" set target=$zu(12,""), source=target ; start in target namespace assuming source == target
  ; k ^mtempfilt ; if you injected your ZJRNFILT
  ; d jrnrest^ztestFF("20190917.001","20190919.008",source,target)
  ; zwrite ^mtempfilt
  ; 
 
jrnrest(pFirst,pLast,pSDB,pTDB)
 new (pFirst,pLast,pSDB,pTDB) ;de!!!
 new $namespace set $namespace="%SYS"
 set pLast=$g(pLast,pFirst)
 set pSDB=$g(pSDB,$zu(12)_"user\")
 set pTDB=$g(pTDB,$zu(12)_"test\")
 set RestOref=##class(Journal.Restore).%New()
 set RestOref.FirstFile=pFirst
 set RestOref.LastFile=pLast
 set RestOref.RollBack=0 ;? default = 1
 write !,pFirst_" to "_pLast_"; "_pSDB_" => "_pTDB_" OK?" read ans#1 write ! quit:"nN"[ans
 set sc=RestOref.RedirectDatabase(pSDB,pTDB) if 'sc goto jrnbad
 set sc=RestOref.SelectUpdates(pSDB) if 'sc goto jrnbad ; all globals; need it to fire RedirectDatabase
 set RestOref.Filter="^ZJRNFILT1" ; means nothing as ^ZJRNFILT is always used
 set t0=$zh
 set CHATTY=0 ; has no effect
 set sc=RestOref.Run() if 'sc goto jrnbad
 set t1=$zh
 write "sc="_sc_" dt="_$fn(t1-t0,"",3),!
jrnbad
 if $g(sc)'="",'sc do $system.Status.DisplayError(sc)
 quit

Why do you think that ZJRNFILT doesn't work? Maybe your journal files just don't contain any `set ^ABC(I)=I` commands. After all, such guesses are easy to check: just inject a couple of statements into your code:

ZJRNFILT(jid,dir,glo,type,restmode,addr,time) /*Filter*/
  Set restmode=1                              /*Return 1 for restore*/
  If glo["ABC",type="S",$i(^mtempfilt("S"))
  If glo["ABC",type="K",$i(^mtempfilt("K")) Set restmode=0 /*except if a kill on ^ABC*/
  Quit

and check ^mtempfilt value after the journal restoration:

  zw ^mtempfilt

Our current plan is to continue development in Caché, with some kind of code converter to IRIS. It would be nice if ISC provided such a utility itself, at least as a starting point for further customisation.

Taking this approach, we would not lose any functionality. We confess that this way we are restricting ourselves to the features available on both platforms, but that does not seem a huge price for the benefits of a single code base.

You'd hardly find a Caché plugin for any 3rd-party tool, while the SNMP support included in all InterSystems products allows easy integration with most of them (Zabbix, Nagios, etc.). It's even possible to write a custom MIB to provide additional metrics; we tried it and it really worked.

Nowadays more lightweight approaches are gaining popularity, such as the Grafana-based ones mentioned by Evgeny. I'd personally recommend an enterprise-scale tool such as Zabbix if and only if you really need a solution that monitors all parts of your data center infrastructure (servers, OS, LAN, etc.) besides the DBMS; otherwise it would be overkill, and I'd prefer a Grafana-based one.

This can be a sign of overly long Write Daemon buffer queues on your DR Backup member. The reason can be insufficient throughput of its disk I/O subsystem and/or disk I/O problems. I would start by looking through cconsole.log for messages indicating:

  • abnormal latency of disk access
  • Write Daemon non-responsive "more than 5 minutes" states
  • Write Daemon panic
  • and so on.

The next step would be to start pButtons monitoring around the clock to pin down the suspected issue(s) more precisely. Looking at OS performance logs can be helpful as well.
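
For instance, a 24-hour collection can be kicked off programmatically with one of the profiles pButtons ships with (run it from the %SYS namespace; the "24hours" profile name is the stock one and may differ in your installation):

  %SYS>do run^pButtons("24hours")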

Continuing Robert's list:

5) CACHETEMP is always local on APP servers (= ECP clients). In our experience this is a very important feature, because it keeps the processing of temporary data local, avoiding extra network traffic and ECP server disk I/O workload. One of the surprises ECP gave me: while it's relatively easy to achieve high intra-ECP network speed (~10 Gbit/s hardware is readily available), the ECP server's disk I/O subsystem can easily become a bottleneck unless you carefully spread data processing among the ECP clients.

6) Sergio, you wrote:

I mean, all the data would be stored on disk and will have to be synchronized through the net with the other APP servers.

If you really need to synchronize even temporary data, then simple horizontal scaling with ECP, without some optimization of data processing (see item 5), can be less cost-effective than a comparable vertical scaling solution.

This is the well-known classic approach to simulating asynchronous behavior in a "synchronous only" language. It potentially suffers from two problems, both caused by the need to check whether something has arrived on the TCP connection:

1) Read x:timeOut (where timeOut>0) causes delays of up to timeOut seconds, which are more or less acceptable for a background job but not for the foreground (e.g., some UI handling).

2) Read x:0 is too unreliable, as it succeeds only if the data happens to be already in the buffer at that very moment.
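
To illustrate problem 1, here is a minimal sketch of the polling pattern in question (tcpdev and handle are illustrative names, not from anybody's original code):

poll ; check the TCP device for data, waiting at most 1 second
 new x
 use tcpdev read x:1
 if $test do handle(x) quit  ; $TEST=1: data arrived before the timeout
 ; $TEST=0: the timeout expired, i.e. the delay described in problem 1
 quit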

Agree with Dmitry: it helps that "not all those tasks have to be asynchronous".

A customer of ours runs a production system of two mirrored DB servers and 13 APP ones, operating 24x7 with a permissible downtime of <= 2 hours once a month.

The total production DB size is ~4 TB. A full backup takes ~10 hours, its restore ~16 hours. Therefore incremental backups are scheduled nightly and a full backup weekly; to avoid a performance penalty, the Mirror Backup server is assigned to run all this activity.

As to the integrity check, it turned out to be impractical once the DB size outgrew 2 TB: a weekend was no longer enough to run it through, so it was disabled. This doesn't seem to be a great problem because:

  • Enterprise-grade SANs are very reliable; the probability of errors is very low.
  • Let's imagine that an integrity error occurred. Which database blocks would most probably be involved? Those actively changed by users' processes, rather than those residing in the "stale" part of the database. So the errors would most likely show up at the application level sooner than they could be caught by an integrity check. What do you think the administrator would do in case of such errors? Stop production for several days to run an integrity check and fix the errors? No, they would just switch from the damaged Primary to the (healthy) Backup server. And what if the Backup server is damaged as well? As the two reside on separate disk groups, this scenario means a real disaster, and the best solution then is to switch to the Async Backup member residing at another data center.

Thanks, Larry,
Adding your variant to the "code base".
Like most of the others, your solution needs the * length correction *, which I'm keeping outside the Xecute code to improve readability.

lpad(number=1,length=4)
  new code,i,z,sign
  set number=+number
  set sign="" set:number<0 sign="-",number=-number
  set code($increment(code))="w sign_$tr($j(number,length),"" "",""0"")"
  set:length<$l(number) length=$l(number) ;* length correction *
  set code($increment(code))="w sign_$e(1E"_length_"+number,2,*)"
  set code($increment(code))="w sign_$e(10**length+number,2,*)"
  set code($increment(code))="w sign_$e($tr($j("""",length),"" "",0)_number,*-length+1,*)"
  set code($increment(code))="s $P(z,""0"",length)=number w sign_$E(z,*-(length-1),*)"
  for i=1:1:code write code(i),"  => ",?60 xecute code(i) write !
  quit

If we are looking for a more generic solution, e.g. padding <number> with zeroes to a field of the given <length>, let's try:

lpad(number=1,length=4)
  new code,i
  set code($increment(code))="w $tr($j(number,length),"" "",""0"")"
  set code($increment(code))="w $e(1E"_length_"+number,2,*)"
  set code($increment(code))="w $e(10**length+number,2,*)"
  set code($increment(code))="w $e($tr($j("""",length),"" "",0)_number,*-length+1,*)"
  for i=1:1:code write code(i)," => " xecute code(i) write !
  quit

Some results are:

for n=999,9999,99999 d lpad^ztest(n,4) w !
 
w $tr($j(number,length)," ","0") => 0999
w $e(1E4+number,2,*) => 0999
w $e(10**length+number,2,*) => 0999
w $e($tr($j("",length)," ",0)_number,*-length+1,*) => 0999
 
w $tr($j(number,length)," ","0") => 9999
w $e(1E4+number,2,*) => 9999
w $e(10**length+number,2,*) => 9999
w $e($tr($j("",length)," ",0)_number,*-length+1,*) => 9999
 
w $tr($j(number,length)," ","0") => 99999
w $e(1E4+number,2,*) => 09999
w $e(10**length+number,2,*) => 09999
w $e($tr($j("",length)," ",0)_number,*-length+1,*) => 9999

Only the first solution (John's) provides a valid result even with "bad" input data ($length(99999) > 4). The others can't be amended without extra effort that would be disproportionate to this tiny task. Just to complete it:

lpad(number=1,length=4)
 new code,i
 set code($i(code))="w $tr($j(number,length),"" "",""0"")"
 set code($i(code))="w $e(1E"_$s(length>$l(number):length,1:$l(number))_"+number,2,*)"
 set code($i(code))="w $e(10**$s(length>$l(number):length,1:$l(number))+number,2,*)"
 set code($i(code))="w $e($tr($j("""",$s(length>$l(number):length,1:$l(number))),"" "",0)_number,*-$s(length>$l(number):length,1:$l(number))+1,*)"
 for i=1:1:code write code(i)," => ",?60 xecute code(i) write !
 quit

Now all the solutions are correct even with <number> >= 99999, but only the first one keeps its simplicity.

Hi Pete,
we have implemented a model that is very similar to yours, with slight differences:

  • we use it in-house only, where we have several development and testing Caché instances;
  • most of them are not connected via Mirroring and/or ECP;
  • all Caché users are LDAP users, so we don't need to bother with creating/modifying users on a per-instance basis;
  • one instance is used as a repository of role definitions (the so-called Roles Repository); it "knows" about each role that should be defined on each instance; this repository is wrapped with a REST service;
  • each role has a "standard" name, e.g. roleDEV, roleUSER, roleADM; these names are used in LDAP users' definitions and retrieved during the LDAP authentication process;
  • the resource lists for each standard role are stored in the Roles Repository; those lists are associated with instance addresses (server+port), so each role can be defined differently for different Caché instances; therefore, a user with the role roleDEV can have different privileges on different servers;
  • each Caché instance queries the Roles Repository at startup; after getting the current definitions of its roles, it applies them to its Caché security database (a minimal sketch of this step follows the list).
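
Here is a rough sketch of that startup step, not our actual code: the repository address, the /roles endpoint, and the line format are hypothetical; only %Net.HttpRequest and the Security.Roles calls are real API:

RolesSync ; query the Roles Repository and (re)define the local roles
 new $namespace set $namespace="%SYS"
 new req,line,role,resources,prop
 set req=##class(%Net.HttpRequest).%New()
 set req.Server="rolesrepo.example.com",req.Port=8080  ; hypothetical address
 quit:'req.Get("/roles?instance="_$zcvt($zu(110),"O","URL"))
 while 'req.HttpResponse.Data.AtEnd {
   set line=req.HttpResponse.Data.ReadLine()  ; assumed format: role|resource:permission,...
   set role=$p(line,"|"),resources=$p(line,"|",2)
   if ##class(Security.Roles).Exists(role) {
     kill prop
     set prop("Resources")=resources
     do ##class(Security.Roles).Modify(role,.prop)
   }
   else {
     do ##class(Security.Roles).Create(role,"",resources)
   }
 }
 quit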

This solution has been used in our company's Dev & Testing environment for more than a year without major problems. It is rather flexible and doesn't depend on proprietary transport protocols. The only drawbacks found so far are:

  • the Roles Repository is automatically queried at Caché startup only, so if something in the role definitions should be changed on the fly, manual re-querying is needed;
  • sometimes InterSystems introduces new security caveats with the new versions of its products; one of them was a subject of an article here: https://community.intersystems.com/post/implicit-privileges-developer-ro....

Dear colleagues,

Thank you for paying so much attention to this tiny question. Maybe it was formulated too tersely: I should have mentioned that the impact of object instantiation is beyond the scope of the question, as all the objects are instantiated once; the corresponding OREFs are stored in global-scope variables for "public" use.

Going deep inside with %SYS.MONLBL is possible, but I'm too lazy to do it having no real performance problem. So I wrote several dummy methods, duplicated as instance and class ones, with different numbers of formal arguments, from 0 to 10. Here is the code:

Class Scratch.test Extends %Library.RegisteredObject [ ProcedureBlock ]
{

ClassMethod dummyClassNull() As %String
{
  quit 1
}

Method dummyInstNull() As %String
{
  quit 1
}

ClassMethod dummyClass5(a1, a2, a3, a4, a5) As %String
{
  quit 1
}

Method dummyInst5(a1, a2, a3, a4, a5) As %String
{
  quit 1
}

ClassMethod dummyClass10(a1, a2, a3, a4, a5, a6, a7, a8, a9, a10) As %String
{
  quit 1
}

Method dummyInst10(a1, a2, a3, a4, a5, a6, a7, a8, a9, a10) As %String
{
  quit 1
}

}

My testing routine was:

ClassVsInst
   set p1="пропоывшыщзшвыщшв"
   set p2="гшщыгвыовлдыовдьыовдлоыдлв"
   set p3="widuiowudoiwudoiwudoiwud"
   set p4="прпроыпворыпворыпворыпв"
   set p5="uywyiusywisywzxbabzjhagjЭ"
   set p6="пропоывшыщзшвыщшв"
   set p7="гшщыгвыовлдыовдьыовдлоыдлв"
   set p8="widuiowudoiwudoiwudoiwud"
   set p9="прпроыпворыпворыпворыпв"
   set p10="uywyiusywisywzxbabzjhagjЭ"
   do run^zmawr("s sc=##class(Scratch.test).dummyClass10(p1,p2,p3,p4,p5,p6,p7,p8,p9,p10)",1000000,"dummyClass10 "_$p($zv,"(Build"))
   set st=##class(Scratch.test).%New() do run^zmawr("s sc=st.dummyInst10(p1,p2,p3,p4,p5,p6,p7,p8,p9,p10)",1000000,"dummyInst10 "_$p($zv,"(Build"))
   set st=""
   do run^zmawr("s sc=##class(Scratch.test).dummyClass5(p1,p2,p3,p4,p5)",1000000,"dummyClass5 "_$p($zv,"(Build"))
   set st=##class(Scratch.test).%New() do run^zmawr("s sc=st.dummyInst5(p1,p2,p3,p4,p5)",1000000,"dummyInst5 "_$p($zv,"(Build"))
   set st=""
   do run^zmawr("s sc=##class(Scratch.test).dummyClassNull()",1000000,"dummyClassNull "_$p($zv,"(Build"))
   set st=##class(Scratch.test).%New() do run^zmawr("s sc=st.dummyInstNull()",1000000,"dummyInstNull "_$p($zv,"(Build"))
   q

run(what,n,comment) ; execute line 'what' 'n' times
   set n=$g(n,1)
   set comment=$g(comment,"********** "_what_" "_n_" run(s) **********")
   write comment,!
   set zzh0=$zh
   for i=1:1:n xecute what
   set zzdt=$zh-zzh0 write "total time = "_zzdt_" avg time = "_(zzdt/n),!
   q

The results were:

USER>d ClassVsInst^zmawr
dummyClass10 Cache for Windows (x86-64) 2017.2.2
total time = .377751 avg time = .000000377751
dummyInst10 Cache for Windows (x86-64) 2017.2.2
total time = .338336 avg time = .000000338336
dummyClass5 Cache for Windows (x86-64) 2017.2.2
total time = .335734 avg time = .000000335734
dummyInst5 Cache for Windows (x86-64) 2017.2.2
total time = .280145 avg time = .000000280145
dummyClassNull Cache for Windows (x86-64) 2017.2.2
total time = .256858 avg time = .000000256858
dummyInstNull Cache for Windows (x86-64) 2017.2.2
total time = .225813 avg time = .000000225813

So, contrary to my expectations, the oref.Method() call turned out to be quicker than its ##class(myClass).myMethod() analogue. As the effect is less than a microsecond per call, I don't see any reason for refactoring.

Hi Mark,

Several years ago we faced a similar problem. An external system needed to pull new requests from our Caché DB and push back the responses. We maintained a transition table in Caché where we placed new requests; the external system polled the table every N seconds, fetching the requests from the table and placing the responses back. Communication was implemented via ODBC.

You can do something similar, filling a transition table on the remote Caché side using triggers associated with the "main" table; see the sketch below.
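
For illustration, such a trigger might look like this; the Scratch.Main / Scratch.Transition classes and their fields are hypothetical, only the trigger mechanics are real:

/// Hypothetical "main" table; the row trigger queues each new request
Class Scratch.Main Extends %Persistent
{

Property Payload As %String;

Trigger NewRequest [ Event = INSERT, Foreach = row/object, Time = AFTER ]
{
  // {%%ID} expands to the ID of the just-inserted row
  new id
  set id={%%ID}
  &sql(INSERT INTO Scratch.Transition (RequestId, Status) VALUES (:id, 'NEW'))
}

}

The external system can then poll Scratch.Transition via ODBC and update its (hypothetical) Status column when the response is placed back.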

If you prefer to code the global size calculation yourself rather than amend ^%GSIZE, a feasible option is to call

set bSize=$$AllocatedSize^%GSIZE(global)

which returns the size in bytes of a global mapped to the current namespace. It recognizes which database the global is mapped from, so you don't need to do that yourself. The only thing you need is a list of the globals in the namespace, which can be fetched in several ways, e.g. using $Order(^$G(global)); a sketch follows the cons list below. It can be used on a per-database basis as well. Pros of this approach:
- speed, as it neither runs a query nor instantiates %SYS.GlobalQuery objects;
- AFAIR, there was an error in the global size calculation of the %SYS.GlobalQuery::Size() query in old Caché versions, up to 2015.1;
- starting from 2015.1, it can be used with subglobals.

Cons:
- this $$-function is not documented;
- not sure if it existed in 2010.1.
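
A minimal sketch of that approach (with the caveats above in mind: the entry point is undocumented, and whether the leading caret must be stripped from the global name is an assumption to verify on your version):

gtotal() ; sum of the allocated sizes, in bytes, of all globals in this namespace
 new total,name
 set total=0,name=""
 for {
   set name=$order(^$GLOBAL(name)) quit:name=""
   ; ^$GLOBAL returns names as "^XYZ"; the caret is dropped here (assumption)
   set total=total+$$AllocatedSize^%GSIZE($extract(name,2,*))
 }
 quit total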