You may also have a look at The Universal MUMPS(M) commander (Alt-NC) (http://minimdb.com/tools/altnc412.html).
In many ways it is similar to other commanders well known outside the Caché/M community, such as Midnight Commander (mc, Linux), File and Archive Manager (far.exe, MS Windows), and of course their famous predecessor, Norton Commander (nc, MS DOS). The full-screen editor included in the commander is very simple and easy to learn. It is not as sophisticated as the great programmers' tools of the past (such as DEC's EDT), but it is a bit smarter than Notepad (MS Windows), as it is fully functional without a mouse.

Pros: it runs from a command line, it doesn't force any extra TCP port to be opened, it supports as many terminal types as the OS does, and it has a non-% version, so you don't need the CACHESYS:RW privilege to load the code.

Cons: the screen design is pretty ancient, but should we expect too much from a CHUI application whose roots go back to the early 1990s?

Hi Murray,
thank you for continuing your series.

Don't you think that VM image backup (despite its importance) has a drawback, in that it may contain a huge amount of data that is unnecessary for simple database restoration? E.g., a VM image may contain hundreds of gigabytes of journals that are useless for the database state in the backup file. IMHO, in this case some kind of selective backup can be attractive. I'm not aware of Veeam's capabilities here, but I'm sure Acronis can do it at the file system level. I wonder whether a selective external backup (e.g., in the case of Veeam) can be integrated with the Caché DB freeze/thaw features as easily as a full one?
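For reference, the freeze/thaw hooks themselves shouldn't care how the external tool selects files. A minimal sketch of the wrapper, using the documented Backup.General class (run in %SYS; the selective-backup trigger itself is just a placeholder):

FREEZE ; pause Caché writes around an external snapshot (a sketch)
 set sc=##class(Backup.General).ExternalFreeze()
 if 'sc write "ExternalFreeze failed",! quit
 ; ... trigger the selective file-level backup here (placeholder) ...
 set sc=##class(Backup.General).ExternalThaw()
 if 'sc write "ExternalThaw failed - check cconsole.log",!
 quit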

c. Map the relevant globals to a non-journaled database

Sometimes this is done just to conceal typical app-level drawbacks (such as overlooking cases where temporary globals could be used, (excessive) looped rewrites of persistent data, etc.), although it may lead to more serious administrative-level drawbacks.

Normally all of an app's globals (whether they are journaled or not) should be coordinated with each other; if a global should not (or need not) be, it is a good candidate to be mapped to the CACHETEMP DB. Any failover scenario you may imagine includes a journal restoration step. Believe me, it can be a great problem for an admin to decide what to do with each non-journaled global after a server fault: which ones can be fully (or partially) KILLed, and which ones need to be ^REPAIRed. In most cases the right decision is impossible without the developer's involvement. So, a simple dev-level solution can introduce far more serious admin-level complications.

IMHO, a global should be considered to be of one of two types, based on its content:
1. "normal" persistent data: such a global should be placed in a journaled DB without exception;
2. temporary data which is needed only while the process runs: map it to CACHETEMP or make it a process-private ^||global (see the sketch after this list).

Sometimes developers have told me that a third type exists, comprising pre-generated data that is stored to improve the performance of some heavy jobs (e.g. reporting). The pre-generation process can be time (and resource) consuming, so it looks like the best place for this third type of global is a non-journaled DB. After several database repairs I'd answer "No!". Depending on the details of the pre-generation process, each of these globals can (and should) be put into one of the two categories above, taking into account that reboots of modern servers are relatively rare events:
- if it's not too hard to regenerate the data: just automate it in your code and map the global to CACHETEMP;
- if not, consider the global an operational one and place it into a journaled DB; to reduce excessive journaling during its generation, just choose between approaches "e" or "d" of Tani's article.
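To illustrate type 2, a minimal sketch (the routine and global names are made up): a throwaway index built in a process-private global, which is never journaled and disappears with the process:

TMPIDX ; build a throwaway index in a process-private global (a sketch)
 new i,item,list
 set list="beta,alpha,gamma"
 kill ^||idx                 ; process-private: invisible to other processes,
                             ; never journaled, removed when the process ends
 for i=1:1:$length(list,",") {
     set item=$piece(list,",",i)
     set ^||idx(item)=i
 }
 set item=""                 ; iterate in collation (sorted) order
 for { set item=$order(^||idx(item)) quit:item=""  write item,! }
 quit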

Hi Murray,
Speaking of ECP, we usually imagine distributed data processing on several app servers. But what about distributed databases? E.g., can splitting the databases among several data servers, just to distribute heavy journal and/or write daemon load, be a smart solution in some cases?
I see some drawbacks of this solution:
#1 AFAIK, there is no distributed transaction support in Caché.
#2 To couple it with mirroring, one should deploy N mirrors, where N is the number of (primary) data servers; having no "coherent mirroring" option, their N backup members can have different latencies against their primaries, so (bearing in mind #1) switching mirror members can have worse consequences than in the traditional case of only one (primary) data server.

Have I missed something? Or maybe you've seen some field cases where distributing the databases looked smart?

Thank you,

Presumably it's a security issue. Check the effective UID and GID of your Caché processes. To do so, you may check the parameters.isc file in the Caché install directory for lines like these:

security_settings.cache_user: cacheusr
security_settings.cache_group: cacheusr

It's unlikely that the user cacheusr has access rights to another user's home directory.

csession processes are an exception, as they inherit the calling user's UID.

IMHO, it's better to use some neutral folder for file exchange, e.g. "/tmp/myexchange", as in this case it's much easier to establish appropriate access rights for each side involved in the exchange.
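A quick way to verify from inside Caché that the server process can actually write there (a sketch; the folder is just the example above):

 set file="/tmp/myexchange/ping.txt"
 open file:("WNS"):2          ; "WNS" = write, create new, stream mode
 if '$test write "No write access to ",file,! quit
 use file write "hello",! close file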

P.S. The UNIX® Users, Groups and Permissions stuff is well documented, see: http://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=...

Eugene mentions OMI and ECP as traditional kinds of interconnection transport. Neither of them is particularly secure; one may check the docs for proof (as to ECP: http://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=... ).

ECP is defined as a basic service, in contrast to resource-based ones. By design, basic services can't efficiently use the resource/role/user security model. That's understandable, as their layer is usually too low to apply it.

Eugene's services should be handled with the same precautions as other basic services, i.e. used inside a secured perimeter only. Port 500x can (and should) be closed on the external firewall.

Hi all,
Setting a startup routine like this can't lock down non-terminal-based Caché functionality (CSP, etc.), as the routine is run on terminal-type logins only; meanwhile, I agree that LOGIN^%ZSTART should be sufficient in most cases. IMHO, this per-user startup routine setting in the SMP comes from "the good old days" of terminal-oriented apps, which were often tied to terminals to prevent end users from entering MUMPS commands.
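For reference, a minimal sketch of the %ZSTART approach (the menu routine name is made up; see the %ZSTART docs for the full set of entry points):

%ZSTART ; system-wide entry points (a sketch)
SYSTEM ; called once at instance startup
 quit
LOGIN ; called at each terminal login
 do ^MYMENU     ; hypothetical application menu that never drops to the prompt
 quit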
==
Cheers,
Alex

Are there any plans to introduce in 2017.1 a Quick Old Primary Switch-Back feature, which seems to be of great importance for the scenario of temporarily moving the Primary role to a promoted DR Async? It is known as Prodlog 142446. In a few words:

After the promotion of a DR Async without a partner check, the old Primary/Backup members' functionality would likely be restored only after a rebuild. Copying a ~1TB backup over a long-distance link can take many hours or even days, so it would be nice to track the point at which the databases were last "in sync" (though I'm not sure this term can be used in the DR Async case). After that:
- discard the SETs/KILLs that could have been made on the old Primary after this point;
- demote (?) it to be a new DR Async;
- having the most recent journal on this new DR Async, we can promote it to be the very new Primary (returning its "native" role to it).

Thank you, Bob, you mostly answered my questions.

you would compare the name and date of the most recent journal file from the async to the most recent journal file on the DR you are going to promote to see if you can get more recent journal data, which may not be the most recent.

In the meantime we are (internally) discussing the worst case of complete isolation of the main Data Centre, where both Members A and B may be unavailable. In this case, the only thing we can do is check whether the DR Async we are going to promote has the most recent journal data among all the other available Asyncs, right?
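For instance, something like this run on each surviving Async would give the values to compare (a sketch; I'm assuming the documented %SYS.Journal.System and %File classes):

 ; report this member's latest journal file and its timestamp
 set file=##class(%SYS.Journal.System).GetCurrentFileName()
 write "Current journal: ",file,!
 write "Last modified:   ",$zdatetime(##class(%File).GetFileDateModified(file)),!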

The version of Caché is 2015.1.4 for Windows x64, on a Core i5 based laptop. When I have a spare moment, I'll re-run this test on a more powerful server, though I don't expect a noticeable difference.

Let's try to estimate time for both operations:
1: set max(args(i))=i ~ time_to_find_args(i)_position_in_max_array + time_to_allocate_memory_for_max(args(i)) ~ O(ln(length(args(i)))) + O(length(args(i))+length(i)) ~ O(3*ln(length(args(i))))
2: max<args(i)  ~ time_to_compare_max_and_args(i) ~ O(ln(length(args(i))))

So it seems that 2 should be ~3 times quicker than 1, but we don't know the real coefficients behind those O() estimations. I should confess that the local array node allocation penalty turned out to be higher than I expected.

This speed difference should be even greater were the args(i) values strings rather than numbers.
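For anyone who wants to reproduce the comparison, a minimal timing sketch (the routine name and data set are made up; timings will vary with hardware and value length):

TESTMAX ; compare the two ways of finding the maximum of args()
 new args,i,max,n,t1,t2,tmp
 set n=100000
 for i=1:1:n set args(i)=$random(1000000)
 ; approach 1: insert every value as a subscript of a sorted local array
 set t1=$zhorolog
 for i=1:1:n set tmp(args(i))=i
 set max=$order(tmp(""),-1)      ; the last subscript is the maximum
 set t1=$zhorolog-t1
 ; approach 2: a plain comparison inside the loop
 set t2=$zhorolog
 set max=""
 for i=1:1:n if max<args(i) set max=args(i)
 set t2=$zhorolog-t2
 write "array insert: ",t1,"s   compare: ",t2,"s",!
 quit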

Bob,

I have a couple of questions on DR Async.

1) There is an option of DR Promotion and Manual Failover with Journal Data from Journal Files
( http://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=... )
where one is advised to get journal files from one of the failover members.
If both failover members are unavailable, is it possible to get journal files from another (DR or reporting) Async member?
If so, what preliminary configuration steps should one perform on that member to allow this option in case of disaster?


2) Another question is on journal file collection as well. You wrote:

Asyncs receive journal data from the primary asynchronously, and as a result may sometimes be a few journal records behind

Is it true that Asyncs pull journal data from the Primary? If so, the Primary is not aware of whether the data was pulled by an Async or not. Therefore, if an Async has not received journal data from the Primary for several days (e.g. due to communication problems), the next unread journal file may already have been purged from the Primary's local storage.

Is it possible to recover from this situation without rebuilding the Async? E.g., if the purged journals are available as part of a file-system-level backup, or those files are kept on another Async server, can that help?

==
Thanks...