It seems that the source of the problem is the way Cache is [re]started.

When it is started from the shell with the `ccontrol start` command, SuperServer (as well as its children) picks up the system-wide TZ.
But when it is started with a service script, `service ca_cache start`, TZ is not recognized. There is nothing special in my script: its start() function is just a wrapper for the same `ccontrol start` command as in the first case. It seems that service scripts are run in a special environment where some environment variables are deliberately unset.

I fixed the issue by setting the correct TZ in the /etc/environment file and adding one line of code to the service script:

start() {
        echo "Starting ca_$prog:"

        [ -f /etc/environment ] && . /etc/environment && export TZ

        ccontrol start $prog quietly
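
The effect can be reproduced outside Cache. Below is a minimal sketch, assuming the `service` wrapper clears the environment roughly the way `env -i` does (on RHEL/CentOS 6 it passes through only a few variables such as PATH and TERM, as far as I know). The file name `tz.env` and the function `load_tz` are made up for the illustration; a real script would source /etc/environment as shown above.

```shell
#!/bin/sh
# Hypothetical stand-in for /etc/environment, so the sketch is self-contained.
envfile="${TMPDIR:-/tmp}/tz.env.$$"
printf 'TZ=Europe/Moscow\n' > "$envfile"

# load_tz FILE: source FILE if it exists and export TZ
# (the same idea as the one-liner added to start()).
load_tz() {
    [ -f "$1" ] && . "$1" && export TZ
}

# A child started with a cleared environment does not see TZ ...
before=$(env -i PATH="$PATH" sh -c 'echo "tz=${TZ}"')

# ... but a child started after sourcing the file and exporting TZ does.
load_tz "$envfile"
after=$(sh -c 'echo "tz=${TZ}"')

echo "cleared: $before"
echo "loaded:  $after"
rm -f "$envfile"
```

The explicit `export TZ` matters: sourcing the file only sets a shell variable, and child processes such as `ccontrol` inherit only exported variables.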
 

After 
# su cacheusr -
$ csession cache
QMS> w $$tz^ztest()
tap01.sparm.com /dev/pts/2 uid=502(cacheusr) gid=503(cacheusr) groups=503(cacheusr) tz= $ztz 0.0000008660

So the user (cacheusr) rather than the process type seems to be the special case. After setting TZ manually I get the expected response:
tap01.sparm.com /dev/pts/2 uid=502(cacheusr) gid=503(cacheusr) groups=503(cacheusr) tz=Europe/Moscow $ztz 0.0000001271

But why does cacheusr not get its environment from any standard place?

It's easy to set TZ for a Linux/UNIX tty user, but what about an app that runs in some flavor of client/server mode? In this case the Cache process inherits its environment from a special kind of parent, usually from SuperServer^%SYS.SERVER.
At the moment I have no idea how to set the environment for SuperServer. I've tried:
1) setting system-wide using /etc/profile.d/*.sh
2) setting system-wide using /etc/environment
3) setting for cacheusr user using his .bash_profile.
Running the sample below, I get the expected result on /dev/pts and the opposite one on |TCP|1972. The results are added as comments. I used CacheActiveX.dll (%Service_Bindings) for the client/server connection.

tz(fun)
 if $zversion["UNIX" {
   ; read the output of each shell command through a command pipe
   set f="echo $TZ" open f:("QR"):1 if '$test { write 0 quit "" }
   use f read tz close f
   set f="id" open f:("QR"):1 if '$test { write 0 quit "" }
   use f read id close f
 } else {
   set tz="",id=""
 }
 set fun=$get(fun,"$ztz") ;"$h"
 set top=1000000,ts=$zhorolog for i=1:1:top { set @("d="_fun) }
 set res=$zutil(110)_" "_$principal_" "_id_" tz="_tz_" "_fun_" "_$fnumber($zhorolog-ts/top,"",10) ;d $zf(-1,"echo $TZ")
 quit res
 ;
 ; tap01.sparm.com |TCP|1972|1311 uid=502(cacheusr) gid=503(cacheusr) groups=503(cacheusr) tz= $ztz 0.0000011239
 ; tap01.sparm.com /dev/pts/0 uid=504(alex) gid=503(cacheusr) groups=503(cacheusr),10(wheel),505(alex) tz=Europe/Moscow $ztz 0.0000001212
 ;
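
One way to see what a server process actually inherited is to read its environment from /proc (Linux only). This is only a sketch, not a tested recipe for Cache: the function name `tz_of_environ` is hypothetical, and the demo parses sample files in the same NUL-separated format instead of a real `/proc/<pid>/environ`.

```shell
#!/bin/sh
# tz_of_environ FILE: print the value of TZ from a NUL-separated environment
# dump such as /proc/<pid>/environ; prints nothing if TZ is not set there.
tz_of_environ() {
    tr '\0' '\n' < "$1" | sed -n 's/^TZ=//p'
}

# Self-contained demo on sample files in /proc/<pid>/environ format.
with_tz="${TMPDIR:-/tmp}/environ.with.$$"
without_tz="${TMPDIR:-/tmp}/environ.without.$$"
printf 'HOME=/home/alex\0TZ=Europe/Moscow\0' > "$with_tz"
printf 'HOME=/home/cacheusr\0PATH=/usr/bin\0' > "$without_tz"

tz_tty=$(tz_of_environ "$with_tz")      # resembles the /dev/pts session
tz_tcp=$(tz_of_environ "$without_tz")   # resembles the |TCP|1972 job
echo "tty: tz=$tz_tty"
echo "tcp: tz=$tz_tcp"
rm -f "$with_tz" "$without_tz"
```

On a live system the argument would be `/proc/<pid>/environ` of the process in question; reading it requires being the same user or root, and it shows the environment the process was started with, not later changes.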

Hello Mark,

As I had no idea which date and time functions are affected by the TZ setting, I tested some of them with this snippet:

s top=1000000 f fun="$h","$ztz","$zts","$zh" {s ts=$zh f i=1:1:top {s @("d="_fun)} w fun,?8,$fn($zh-ts/top,"",10),!} d $zf(-1,"echo $TZ")

My testing environment was:

%SYS>w $zv
Cache for UNIX (Red Hat Enterprise Linux for x86-64) 2015.1.2 (Build 607_0_15223) Thu Jul 16 2015 17:33:31 EDT
%SYS>!cat /etc/centos-release
CentOS release 6.6 (Final)

The results w/o TZ:

$h      0.0000023653
$ztz    0.0000009993
$zts    0.0000004639
$zh     0.0000003155

and with TZ:

$h      0.0000011856
$ztz    0.0000001379
$zts    0.0000004690
$zh     0.0000003189
Europe/Moscow

I.e., the "]"-operator returns 1 if the first string operand collates after the second string operand.​

That's true for String type collations only (e.g. Cache String, Cyrillic2 String, etc). For most other collations defined in Cache (traditionally called Numeric collations), which comply with the rule "numbers go first" (e.g. Cache Standard, Cyrillic2), one should use "]]" as the "collates after" operator. In general, "]]" seems to be the better choice, as it uses the current collation rather than the String one. This small sample demonstrates the difference between "]" and "]]":

num
 new b,a,i,i0 
 set b("12345678901234567870")="Num20" set b("12345678901234567874")="NotNum20"
 set a=1E30,b(+a)=1,b(+a_"11")=11,b(+a_"111")=111,b(+a_"1a")="3a",a=3E30,b(+a)=3,b(+a_"33")=33,b(+a_"333")=333,b(+a_"1a")="4a"
 set i="",i0="" for  set i=$order(b(i)) quit:i=""  write "(i]]i0)=",i]]i0," (i]i0)=",i]i0," (i>i0)=",i>i0," (i=+i)=",i=+i,?30," " zwrite b(i) set i0=i
 quit

The result for Cache Standard collation is:  

LEARN>d num^ztest
(i]]i0)=1 (i]i0)=1 (i>i0)=1 (i=+i)=1 b(12345678901234567870)="Num20"
(i]]i0)=1 (i]i0)=1 (i>i0)=0 (i=+i)=0 b("12345678901234567874")="NotNum20"
(i]]i0)=1 (i]i0)=0 (i>i0)=1 (i=+i)=1 b(1000000000000000000000000000000)=1
(i]]i0)=1 (i]i0)=1 (i>i0)=1 (i=+i)=1 b(3000000000000000000000000000000)=3
(i]]i0)=1 (i]i0)=0 (i>i0)=1 (i=+i)=0 b("100000000000000000000000000000011")=11
(i]]i0)=1 (i]i0)=1 (i>i0)=1 (i=+i)=0 b("300000000000000000000000000000033")=33
(i]]i0)=1 (i]i0)=0 (i>i0)=1 (i=+i)=0 b("1000000000000000000000000000000111")=111
(i]]i0)=1 (i]i0)=1 (i>i0)=1 (i=+i)=0 b("3000000000000000000000000000000333")=333
(i]]i0)=1 (i]i0)=0 (i>i0)=0 (i=+i)=0 b("10000000000000000000000000000001a")="3a"
(i]]i0)=1 (i]i0)=1 (i>i0)=1 (i=+i)=0 b("30000000000000000000000000000001a")="4a"

While most system administrators may never need or use this function, some employ it for certain kinds of maintenance or other special cases.

Hello Ray, 

I just tried to imagine what kinds of maintenance are better run with mirroring stopped:

  • Cache version upgrade: just stop Cache, no need to stop mirroring;
  • integrity check: maybe, as one may want to have the database in a stale state to avoid "false positives";
  • performing a database backup: not sure, as the mirrored database(s) would apparently fall too far behind the primary one(s) by the time the backup is finished;
  • Have I missed something important? Being a consultant and a trainer, I am eager to take into account as many special cases as possible...

    Thank you,

    Alex

It seems that George James Software (like some other M houses) followed the MSM tradition, where the names of all user-defined command and function extensions could only start with 'ZZ'.

We at SP.ARM have some command extensions as well, and all of them are named 'ZZ*'. E.g., ZZU is a CHUI based namespace navigator, which was originally written for MSM many years ago (when 'U' meant 'UCI') and ported to Cache among the first handy tools we used in those days (y2k +/-1).

This old ZZ-rule was rather convenient, as it naturally separated user extensions (ZZ*) from system ones (Z<not Z>*).

Thank you for the excellent articles, Murray.

We use a slightly different approach for memory planning.

Our app mostly runs as a set of concurrent user sessions, one process per user. Average memory per process is known to be about 10 MB; we multiply it by 3*N_concurrent_users, where the multiplier 3 leaves a margin for memory spikes. The result is the memory we leave for user processes.

We try to leave as much memory as possible for the routine buffer cache, up to 1 GB.

Global buffer memory is usually calculated as 30% of the 3-year-old database size for the given kind of customer. Usually this comes to a 24-64 GB global cache for medium to large hospitals and provides an Rdratio in the thousands (or tens of thousands). On the whole, we usually get numbers close to your 60/40 proportion, although my globuff calculation method is not as precise as yours, and I feel that I need a better basis for it.
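
The rules of thumb above can be written down as simple arithmetic. The sketch below is only an illustration: the number of users and the database size are hypothetical inputs, not figures from a real site.

```shell
#!/bin/sh
# All inputs are illustrative assumptions, not measurements.
mb_per_process=10      # average memory per user process, MB (from the text)
spike_factor=3         # margin multiplier for memory spikes (from the text)
users=500              # hypothetical number of concurrent users
db_size_gb=150         # hypothetical size of a 3-year-old database, GB

# Memory left for user processes: 10 MB * 3 * N
user_mem_mb=$(( mb_per_process * spike_factor * users ))

# Global buffers: 30% of the database size (integer GB)
globuf_gb=$(( db_size_gb * 30 / 100 ))

# Routine buffers: up to 1 GB
routine_mb=1024

echo "user processes:  ${user_mem_mb} MB"
echo "global buffers:  ${globuf_gb} GB"
echo "routine buffers: ${routine_mb} MB"
```

With these inputs the global buffer estimate lands inside the 24-64 GB range mentioned above, but that is a property of the chosen sample numbers, nothing more.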

I'd prefer not to use %ALL at all for a couple of reasons:

- if the mapping is created programmatically, it is not a problem to create it for each namespace,

- if it's created manually, one day you may forget about it and lose some important data or other useful things stored in a non-default DB for years... Of course, all this stuff should be documented and re-checked, but for me it is easier to do it once (for my nsp mappings) than twice (for mine and for %ALL).

Hi Mark,

May I ask you to clarify something?
Am I right in thinking that journal transfer is acknowledged only when it runs between failover nodes, so network latency is not a great problem for async communications? Of course, the bigger the latency, the bigger the journal transfer delay, but it would not slow down the primary node's operation.
E.g., if we have on average 10-20 JrnWrts per second during each busy hour (with spikes up to 300), then for proper async WAN planning should I assume that a latency <= 50 ms (1000/20=50) is sufficient, provided we can live with a DR async node that is usually several seconds behind the primary?
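
The arithmetic behind the 50 ms figure, as a tiny sketch. It only restates the estimate from the question (average gap between journal writes at a given rate) and says nothing about how mirroring actually behaves; the function name `interval_ms` is made up.

```shell
#!/bin/sh
# interval_ms RATE: average gap in milliseconds between journal writes
# at RATE writes per second (integer division).
interval_ms() {
    echo $(( 1000 / $1 ))
}

avg=$(interval_ms 20)     # busy-hour average from the question
spike=$(interval_ms 300)  # spike rate from the question
echo "20 writes/s  -> ${avg} ms between writes"
echo "300 writes/s -> ${spike} ms between writes"
```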

As far as I remember, shadowing allows cascading. Don't you think that cascading could be a good option for deploying several async nodes at one (long distance) location as well?

Thanks,
Alex
 

Thank you for the article, Mark. I have a couple of questions on your "Option 3: Geographically Dispersed Deployments".

You placed two DR async nodes at Data Center B. So the data flow from the Primary (at Data Center A) to the DR asyncs (at Data Center B) will be doubled. In general, WAN is not too quick, so it may increase latency. More importantly, the mirrored databases at the two DR async nodes may be in different states (due to different latency), and it would be difficult to decide which of them should be promoted to the new Primary.
And how do we catch up the database at the new Secondary if we can't guarantee that it is in the same state as at the new Primary?

Well-well, Ed. The intention of the following check:

if zzzzzzZ'["zzzz" 

was to avoid logging my own variables, whose names deliberately start with this nasty prefix. I agree that it's a rather naive trick, but it worked.
As to the original code, it was taken from the existing code base and complied with our internal coding standards; I preferred to publish it "as is", having neither the time nor the will to re-test it.

Some time ago I used the following function to save stacked calls and all defined variables. Variables are not separated by stack levels, which seems to be the reason why $$getst() is quick.

 getst(zzzzgetvars,zzzzStBeg,zzzztemp) ; Save call stack in local or global array
 ; In:
 ; zzzzgetvars = 1 - save variables defined at the last stack level
 ; zzzzgetvars = 0 or omitted - don't save; default = 0
 ; zzzzStBeg - starting stack level for save; default: 1
 ; zzzztemp - where to save ($name).
 ; Out:
 ; zzzztemp    - number of stack levels saved 
 ; zzzztemp(1) - call at zzzzStBeg level
 ; zzzztemp(2) - call at zzzzStBeg+1 level
 ; ...
 ; zzzztemp(zzzztemp) - call at zzzzStBeg+zzzztemp-1 level
 ;
 ; Calls are saved in format:
 ; label+offset^rouname +CommandNumberInsideCodeLine~CodeLine w/o leading spaces"
 ; E.g.:
 ; zzzztemp(3) = First+2^%ZUtil +3~i x=""1stAarg"" s x=x+1 s c=$$Null(x,y)"
 ; Sample calls:
 ; d getst^%ZUtil(0,1,$name(temp)) ; save calls w/o variables in temp starting from level 1
 ; d getst^%ZUtil(1,4,$name(^zerr($i(^zerr)))) ; save calls with variables in ^zerr starting from level 4
 n zzzzloop,zzzzzzZ,zzzzStEnd
 s zzzzgetvars=$g(zzzzgetvars),zzzzStBeg=$g(zzzzStBeg,1) k @zzzztemp s @zzzztemp=0 s zzzzStEnd=$STACK(-1)-2
 for zzzzloop=zzzzStBeg:1:zzzzStEnd s @zzzztemp=@zzzztemp+1,@zzzztemp@(@zzzztemp)=$STACK(zzzzloop,"PLACE")_"~"_$zstrip($STACK(zzzzloop,"MCODE"),"<W") i zzzzgetvars,(zzzzloop=zzzzStEnd) d
 . s zzzzzzZ="" for  s zzzzzzZ=$o(@zzzzzzZ) q:zzzzzzZ=""  if zzzzzzZ'["zzzz" s @zzzztemp@(@zzzztemp,zzzzzzZ)=$g(@zzzzzzZ)
 . i $ze'="" s @zzzztemp@(@zzzztemp,"$ze")=$ze
 q 1

Just adding my 2c to "4. Tip For UNIX sites":
All necessary unix/linux packages should be installed before the first invocation of 

Do run^pButtons(<any profile>)

otherwise some commands may be missing from ^pButtons("cmds"). I recently faced this on a server where sysstat wasn't installed: the `sar -d` and `sar -u` commands were absent. If you decide to install it later (`sudo yum install sysstat` in my case), ^pButtons("cmds") will not be updated automatically without a little help from you: just kill it before calling run^pButtons().

This applies at least to pButtons v.1.15c-1.16c and v.5 (which recently appeared on ftp://ftp.intersys.com/pub/performance/), in Caché 2015.1.2.
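
A quick pre-flight check before the first run can save the trouble. This is a minimal sketch: the function name `missing_tools` is made up, and the list of commands is an assumption (only `sar` is confirmed above), so adjust it to what your pButtons profile actually collects.

```shell
#!/bin/sh
# missing_tools CMD...: print each command that is not found in PATH, one per line.
missing_tools() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# The command list is an assumption; sar is the one mentioned above (sysstat package).
missing=$(missing_tools sar iostat vmstat)
if [ -n "$missing" ]; then
    echo "install before the first run^pButtons:" $missing
else
    echo "all tools found"
fi
```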