Alexey Maslov · Oct 28, 2016 go to post

Using a repository is a good idea for sure, but what about a solution that helps even if an 'intruder' has bypassed it and changed a class, e.g., on a production server? Here is a query that answers who changed SomeClassName.CLS; it can be executed in the "%SYS" namespace using System Management Portal/SQL:

SELECT DISTINCT TOP 1 s.UTCTimeStamp, s.OSUsername, s.Username, s.Description
FROM %SYS.Audit as s
WHERE s.Event='RoutineChange'
AND s.Description LIKE '%SomeClassName.cls%'
ORDER BY s.UTCTimeStamp desc

It's easy to adapt it to search for the same info on MAC, INT and INC routines.
Enjoy!

Alexey Maslov · Oct 26, 2016 go to post

Hi Anzelem,

May I ask you a couple of questions on your DR solution?

Which node would take over on Primary failure: Cache Mirror Backup or VCS secondary if both are alive?

More generally: what is the main reason for mixing two different DR approaches?

=Thanks

Alexey Maslov · Oct 6, 2016 go to post

Sometimes such strange results are caused by ignoring the fact that usually there are several levels of caching, from high to low:

- Caché global cache

- filesystem cache (on Linux/UNIX only, as Windows version uses direct i/o)

- hdd controller cache.

So even restarting Caché may not be enough to drop the caches for a clean "cold" test. The tester should be aware of the data volume involved: it should be much larger than the hdd controller cache (at least). mgstat can help to figure this out; besides, it can show when you start reading data mostly from the global cache rather than from the filesystem/hdd.

Alexey Maslov · Oct 6, 2016 go to post

Hi Murray, thank you for continuing to write very useful articles.

ECP is rather complex stuff, and it seems it is worth additional writing.

Just a quick comment on your point:

For sustained throughput average write response time for journal sync must be:
<=0.5 ms with maximum of <=1 ms.

How can one distinguish journal syncs from other journal records by looking at the iostat log only? It seems that the 0.5-1 ms limit should be applied to every journal write, not only to sync records.

And a couple of small questions. You wrote that
1) "...each SET or KILL the current journal buffer is written (or rewritten) to disk. " 
and
2) "On very busy systems journal syncs can be bundled or deferred into multiple sync requests in a single sync operation."
Having mgstat logs for a (non-ECP) system, is it possible to predict the future journal sync rate after scaling horizontally to an ECP cluster? E.g., if we have average and peak mgstat Gloupds values, can we predict the future journal sync rate? At what rate of journal syncs does their bundling/deferring begin?

Alexey Maslov · Oct 3, 2016 go to post

However, the Newbie can ignore it all, by using Caché SQL

If so, how do you answer the curious Newbie's question: why should I use Caché at all, as a few SQL implementations are available for free nowadays?

Usually those questions were answered like this: Caché provides Unified Data Architecture that allows several access methods to the same data (bla-bla-bla), and the quickest of them is Direct Global Access. If we answer this way, we should teach how to traverse across the globals, so you are doing the very right and useful thing!
I have only one remark, IMHO: semantics can be more difficult to grasp than syntax. Whether one writes `while (1) { ... }` or `for { ... }`, it's basically all the same, while using $order or $query changes the traversal algorithm a lot, and it seems that this should be discussed in more detail.
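
To make the difference concrete, here is a toy Python model (not ObjectScript, and not how Caché implements it). The dictionary `GLOBAL`, the function names, and the all-string subscripts are my own assumptions for illustration; real collation of numeric and string subscripts is ignored.

```python
# Toy model of a sparse global: keys are subscript tuples, values are node data.
# Assumption: all subscripts are strings, ordered by plain Python string order.
GLOBAL = {
    ("a",): 1,
    ("a", "x"): 2,
    ("a", "y", "1"): 3,
    ("b", "z"): 4,   # note: node ("b",) itself holds no data
}

def order(tree, parent, last=None):
    """Model of $order: next subscript at ONE level under `parent`."""
    depth = len(parent)
    siblings = sorted({k[depth] for k in tree
                       if len(k) > depth and k[:depth] == parent})
    for s in siblings:
        if last is None or s > last:
            return s
    return None  # end of this level

def query(tree, last=None):
    """Model of $query: next full reference WITH DATA, in depth-first order."""
    for k in sorted(tree):
        if last is None or k > last:
            return k
    return None  # end of the whole tree

def walk_order(tree, parent=()):
    """Collect all subscripts of one level by repeated order() calls."""
    subs, s = [], order(tree, parent)
    while s is not None:
        subs.append(s)
        s = order(tree, parent, s)
    return subs

def walk_query(tree):
    """Collect all data nodes by repeated query() calls."""
    refs, k = [], query(tree)
    while k is not None:
        refs.append(k)
        k = query(tree, k)
    return refs

print(walk_order(GLOBAL))          # one level only: ['a', 'b']
print(walk_order(GLOBAL, ("a",)))  # children of "a": ['x', 'y']
print(walk_query(GLOBAL))          # every data node, depth first
```

In this model the $order-style walk sees "b" although it holds no data, and needs one loop per level, while the $query-style walk visits only nodes with data and crosses levels by itself.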

Alexey Maslov · Sep 29, 2016 go to post

Basic and advanced modes were in an old version of another tool named ^Buttons. With ^pButtons you have an option to reduce the number of OS commands being performed, as shown in Tip #4.

Alexey Maslov · Sep 28, 2016 go to post

Good Morning, William!
To trace logon errors you should look at events with Name = LoginFailure

Alexey Maslov · Sep 28, 2016 go to post

It is hard to guess what kind of problem you have without looking at the Cache Security audit records of your logon attempts.

Alexey Maslov · Sep 26, 2016 go to post

Checking .LCK files is useless in most cases, as the Caché service auto-starts with OS startup. Of course, switching auto-start off is not a problem for a development/testing environment.

Frank touched on another interesting question: how long should WaitToKillServiceTimeout be? If we set it to ShutdownTimeout + Typical_Real_Shutdown_Time, and Caché hangs during OS shutdown, I bet that a typical Windows admin won't wait 5 minutes and will finish with a hardware reset...  Choosing between bad and worse, I'd set
WaitToKillServiceTimeout = Typical_Real_Shutdown_Time
letting the OS force Caché down in the rare cases when it hangs.

Alexey Maslov · Sep 26, 2016 go to post

According to the documentation for Caché v. 2015.1, the ShutdownTimeout parameter ranges from 120 to a maximum of 100,000 seconds, with a default of 300 seconds. In my case its value is 120 seconds, but even in the worst cases I've managed to find in my log, shutdown completed faster, in approx. 40 seconds, e.g.:

05/18/16-18:51:45:817 (3728) 1 Operating System shutdown!  Cache performing fast shutdown.
05/18/16-18:52:27:302 (3728) 1 Forced 11 user processes.  They may not have completed.
05/18/16-18:52:27:302 (3728) 0 Fast shutdown complete
05/18/16-18:52:27:474 (3728) 0 CONTROL exited due to force
05/18/16-18:52:27:630 (3656) 0 JRNDMN exited due to force
05/18/16-18:52:27:614 (3560) 0 GARCOL exited due to force
05/18/16-18:52:27:802 (1064) 0 EXPDMN exited due to force
05/18/16-18:52:27:786 (3760) 0 No blocks pending in WIJ file
05/18/16-18:52:27:880 (3760) 0 WRTDMN exited due to force

although one can see the word "force" in the log... It seems that OS shutdown is a special case of forcing Caché down, without waiting ShutdownTimeout seconds. I plan to adjust the registry value as suggested in this article and check what happens on the next OS shutdown (when I decide to do it).
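
For anyone who wants to measure their own typical shutdown time from the console log, here is a minimal Python sketch. It assumes the timestamp layout seen in the excerpt above (mm/dd/yy-hh:mm:ss:mmm) and matches on the two message texts from that excerpt; the function names are mine, and other Caché versions may word the messages differently.

```python
from datetime import datetime

# Three lines copied from the log excerpt above.
LOG = """\
05/18/16-18:51:45:817 (3728) 1 Operating System shutdown!  Cache performing fast shutdown.
05/18/16-18:52:27:302 (3728) 1 Forced 11 user processes.  They may not have completed.
05/18/16-18:52:27:302 (3728) 0 Fast shutdown complete
"""

def parse_ts(line):
    """Parse the leading timestamp of a console log line (assumed mm/dd/yy)."""
    stamp = line.split(" ", 1)[0]
    return datetime.strptime(stamp, "%m/%d/%y-%H:%M:%S:%f")

def shutdown_seconds(log_text):
    """Seconds between the 'fast shutdown' start and 'complete' messages."""
    start = end = None
    for line in log_text.splitlines():
        if "performing fast shutdown" in line:
            start = parse_ts(line)
        elif "Fast shutdown complete" in line and start is not None:
            end = parse_ts(line)
            break
    if start is None or end is None:
        return None  # no complete start/finish pair found
    return (end - start).total_seconds()

print(shutdown_seconds(LOG))  # about 41.5 seconds for this excerpt
```

Running it over several past shutdowns would give a rough value for Typical_Real_Shutdown_Time.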

Alexey Maslov · Sep 26, 2016 go to post

Steve, 
> if I were to see this I would then check that everything closed nicely
How do I do this check? Every Caché startup after a fast shutdown is accompanied by the following message in the console log:

09/16/16-13:48:38:305 (2132) 2 Previous system shutdown was abnormal, system forced down or crashed

while there are no logged signs of "normal" forcing down (system tables dump, etc). Maybe this rule has exceptions, but I've never seen them.

==
Thank you,
Alex

Alexey Maslov · Sep 26, 2016 go to post

Mike,
I fully agree with you that newbies should be aware of dotted syntax and some other old idioms that can be encountered in legacy code, but it seems that they also need some judgment from "oldies": which coding constructs are more or less acceptable and which should be avoided by any means (like usage of the $ZOrder and $ZNext functions).
P.S. (offtop) Could we have met each other in Bratislava in 1992?

Alexey Maslov · Sep 26, 2016 go to post

Stephen, thank you for the info.

May I ask you to clarify a little: should the console log messages like this one

08/29/16-09:41:03:376 (4864) 1 Operating System shutdown!  Cache performing fast shutdown.

be considered as symptoms of this issue?

Alexey Maslov · Sep 24, 2016 go to post

We distribute a COS routine which runs after our application update is imported and does all the necessary work: it checks whether the TASKMGR's task already exists in the schedule, and sets it up if not. Some tasks are subject to manual setup, as their settings may depend on local instance specifics.

Alexey Maslov · Sep 8, 2016 go to post

Bob, I suggested deploying a reporting async just because Kevin mentioned that his boxes have different roles: one is running the production system, the other is being used for development. How well it fits Kevin's needs depends on several factors, one of which is whether or not his code and data reside in separate databases.

Alexey Maslov · Sep 8, 2016 go to post

Kevin, in addition to Bob's point:

you can deploy a mirror comprising two members: #1 - primary and #2 - async (reporting/RW). Definitely it is not a real DR solution, just a way to implement two potentially different systems working with two slightly different copies of the same database.

Alexey Maslov · Sep 8, 2016 go to post

Kevin,

If your licenses for both platforms (different in the InterSystems license model) allow running Mirroring, the answer would be 'yes'. AFAIK, you need at least Entree Multi-Server.

In most cases Cache programs don't depend on the bitness of the underlying platform, so you can run all of them in 64bit. The only cases when bitness comes into play occur when you are interacting with external applications, e.g. communicating through SQL GateWay with an external DSN that has only a 32bit ODBC driver, when a 64bit version of the driver is not available.

I'd ask another question: Kevin, did you ever try to configure and run Mirroring on Windows 10? Was it working?

I tried it a couple of months ago (Cache 2015.1.4 on Windows 10) and completely failed: mirror members could not communicate. Having enough experience in Mirroring deployment on Red Hat EL and MS Windows Server,  I was sure that it was a platform issue rather than my own. As it was not of great importance for me, I decided not to disturb WRC and dropped it.

Alexey Maslov · Sep 6, 2016 go to post

Dmitry,

Ricardo is right: DISABLE^%NOJRN (despite its non-intuitive name) disables journaling for the current process. The documentation states this; besides, it's easy to check:

USER>d DISABLE^%NOJRN w $$CURRENT^%NOJRN
0
USER>d ENABLE^%NOJRN w $$CURRENT^%NOJRN
1
USER>w $$DisableJournal^%SYS.NOJRN," ",$$CURRENT^%NOJRN
1 0
USER>d EnableJournal^%SYS.NOJRN w $$CURRENT^%NOJRN
1

Alexey Maslov · Sep 5, 2016 go to post

Timur, thank you for the series of articles!
It's clear enough that the purpose of your 1st sample was just to introduce map-reduce ideas, but besides that you've illustrated a concurrent processing technique available in Caché which can be used apart from map-reduce. From this point of view, the parallel implementation of the word count algorithm could be better balanced if the method MR.Sample.WordCount.AppWorkers::Map() just counted words, emitting the result to infraPipe. In this case the Reduce() method becomes trivial, as all it needs to do is sum 4 (= number of book volumes) numbers from the infraPipe.
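
The balance I mean can be sketched in Python (not ObjectScript, and not your MR classes). Threads stand in for the worker processes, the list of partial counts stands in for infraPipe, and the volume texts and function names are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the 4 book volumes.
VOLUMES = [
    "war and peace volume one",
    "volume two",
    "the third volume",
    "and the last one",
]

def map_count(text):
    """Map step: the worker counts the words itself and emits ONE number."""
    return len(text.split())

def reduce_sum(partial_counts):
    """Reduce step: trivial, just sums one number per volume."""
    return sum(partial_counts)

def word_count(volumes):
    """Run one map worker per volume, then reduce the partial counts."""
    with ThreadPoolExecutor(max_workers=len(volumes)) as pool:
        partial_counts = list(pool.map(map_count, volumes))
    return reduce_sum(partial_counts)

print(word_count(VOLUMES))  # 14 words in the sample volumes
```

All the heavy work stays in the parallel map step, and the only data crossing the pipe is one integer per volume.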

Alexey Maslov · Aug 12, 2016 go to post

 "How robust is your great OS",!
 "(Each of which has cons and pros),",!
 "But specific under stress?",!
 "Want to check it? Just key press:",! *key set OS=$system.Version.GetOS()
 w:OS="Windows" "Are you really its follower? press <RESET> to do failover!",!
 w:OS="UNIX" "Under *NIX you are OK: your job killed, others remain.",!
 w:OS="VMS" "Not aware of VMS, contribute somebody else!"
 ;for i=1:1 set a(i)=$j("",3*1024*1024) $j(i,4)

Alexey Maslov · Aug 2, 2016 go to post

#CachéLimerick

There was an old man in the vale
Whose DB was as fast as a snail.
After move to Caché
He could jump entrechat
And today he is far from his vale...

Alexey Maslov · Jul 18, 2016 go to post

Fabian, yes, it would be interesting to hear more on your approach.

Recently I faced a similar problem: we were asked for a tool to estimate the size of each of the top N biggest globals. Our solution is to calculate the global sizes on a regular basis (using a Caché Task Manager task) and to transfer the results to an external SNMP server (using our own customized MIB). Visualization is provided by the SNMP server (we and our customer use Zabbix).

As to global size calculation speed, in our case it takes about 30 minutes for a 1 TB database. Only allocated space is estimated.

Alexey Maslov · Jul 11, 2016 go to post

Not tested for speed, but I expect this version to be rather fast, as it compares the common parts of both references rather than individual subscripts. Enjoy!

tttcmp(fgname,tgname,bKill,nErrTotal,nErrTop) ; Compare [sub]array @fgname with [sub]array @tgname
 ;In:
 ; fgname - "original" [sub]array
 ; tgname - its copy to check against
 ; bKill - kill @tgname if it matches @fgname (default = 0)
 ; nErrTop - # of mismatches to find before stopping the comparison (default = 1)
 ;
 ;Out:
 ; returns 1 on full subscripts and data match, else - 0.
 ; ByRef nErrTotal - # of mismatches.
 ;
 new x,y,xtop,ytop,i,flOK,flQ,xquit,yquit,nErr,xstart,ystart
 set bKill=$get(bKill,0)
 set nErrTop=$get(nErrTop,1)
 set x=fgname,y=tgname write !,"Comparing original "_fgname_" with imported "_tgname_":"
 set xstart=$length($name(@x,$qlength(x)))+$select($qlength(x):1,1:2)
 set xtop=$select($qlength(x):$extract(x,1,$length(x)-1)_",",1:x)
 set ystart=$length($name(@y,$qlength(y)))+$select($qlength(y):1,1:2)
 set ytop=$select($qlength(y):$extract(y,1,$length(y)-1)_",",1:y)
 set flOK=1,flQ=0,nErr=0,nErrTotal=0
 for i=1:1 do  quit:flQ
 . set x=$query(@x),xquit=x=""!(x'[xtop)
 . set y=$query(@y),yquit=y=""!(y'[ytop)
 . if xquit,yquit write " OK. i=",i set flQ=1 quit
 . if xquit!yquit write " NO!!!: i=",i,$select(xquit:" "_fgname_" is shorter than "_tgname,1:" "_tgname_" is shorter than "_fgname) set nErrTotal=nErrTotal+1,flOK=0,flQ=1 quit
 . if $extract(x,xstart,$length(x))'=$extract(y,ystart,$length(y)) write !,"!!! Ref NEQ: i=",i write !," x=",x,!," y=",y  set nErrTotal=nErrTotal+1,nErr=nErr+1,flOK=0 set:nErr'<nErrTop flQ=1 quit:flQ  ;!,$e(x,xstart,$l(x)),!,$e(y,ystart,$l(y)),
 . if $get(@x)'=$get(@y) write !,"!!! Data NEQ: i=",i write !," *** x = ",x,!," x => ",@x,!," *** y = ",y,!," @y => ",@y set nErrTotal=nErrTotal+1,nErr=nErr+1,flOK=0 set:nErr'<nErrTop flQ=1 quit:flQ
 . else  set nErr=0
 if flOK,bKill write !,"Killing "_tgname_"..." kill @tgname
 else  write !,"Not Killing "_tgname
 quit flOK
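
The idea of the routine, stripped of output and of ObjectScript specifics, can be modelled in Python as below. This is a simplified sketch, not a port: arrays are dictionaries keyed by subscript tuples, all names are hypothetical, and it stops after `max_errors` mismatches in total, while the routine above resets its counter after each matching node.

```python
from itertools import zip_longest

def relative_nodes(tree, top):
    """Sorted (relative subscripts, data) pairs for all nodes under `top`."""
    n = len(top)
    nodes = [(k[n:], v) for k, v in tree.items() if k[:n] == top]
    return sorted(nodes, key=lambda node: node[0])

def compare_subtrees(src, src_top, dst, dst_top, max_errors=1):
    """Walk both subtrees in lockstep; return (ok, number_of_mismatches)."""
    errors = 0
    pairs = zip_longest(relative_nodes(src, src_top),
                        relative_nodes(dst, dst_top))
    for x, y in pairs:
        # One test covers three cases: one side is shorter (None),
        # the relative references differ, or the data differ.
        if x != y:
            errors += 1
            if errors >= max_errors:
                break
    return errors == 0, errors

# Hypothetical sample data: an "original" subtree and its copy.
ORIG = {("orig", "1"): "a", ("orig", "2", "x"): "b"}
COPY = {("copy", "1"): "a", ("copy", "2", "x"): "b"}

print(compare_subtrees(ORIG, ("orig",), COPY, ("copy",)))  # (True, 0)
```

Only the parts of the references below the two starting points are compared, so the original and its copy may live under different names or subscripts.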

Alexey Maslov · Jul 6, 2016 go to post

Murray, thank you for continuing the series.
I have a small question: does quicker journal response in an ECP environment depend on whether or not Mirroring is used?

Alexey Maslov · Jun 30, 2016 go to post

Dmitry, thanks for sharing this info. One typo to fix: here

/usr/share/bash-completion/completions/ccontrol

# bash completions for InterSystems csession

it should be: /usr/share/bash-completion/completions/csession

Alexey Maslov · Jun 23, 2016 go to post

Hello Mike,

Add to your list several books written in Russian and in German:

  • M СУБД  - 2013, by Eugine Karataev
  • Von ANS MUMPS zu ISO/M - Fortgeschrittene Programmierung in M  - 1993, by Wolfgang Kirsten
  • Einführung in die Programmiersprache MUMPS  - 1989, by Stephan Hesse and Wolfgang Kirsten.

=
Kind regards,
Alex