Thank you, Timothy.
This is definitely a solution, but our case is a bit more complex. There can be several running copies of the "dangerous" utility started by different users, so we can't select the right one either by executable name or by window title, as it is started with the "-nogui" option and has no window.

The approach to bypassing this limitation that we are going to implement looks like this:

 lock +^offPID
 set rc=$zf(-2,"drive:\path\utility --par1 --parN")
 if rc=0 {
   get PIDs of all running copies of utility.exe
  if '$data(^offPID(PID1)) { // a new PID has appeared (let it be PID1)
      set ^offPID(PID1)=$h
       job checkJob(PID1)
   }
 } else {
   process an error
 }
 lock -^offPID
 ...
 ...

checkJob(pPID) // check if pPID is running
  for {
    get PIDs of all running copies of utility.exe
    if pPID is not listed {
       kill ^offPID(pPID)
       quit
     
   } elseif timeout expired {
       set rc=$zf(-1,"taskkill /pid "_pPID)
       if rc=0 { ... }
       else { process an error }
       kill ^offPID(pPID)
       quit

   } else {
      hang 30 // wait...
   }
}
quit
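The pseudocode step "get PIDs of all running copies of utility.exe" could look roughly like this on Windows. This is only a sketch, assuming tasklist is available; the temp file path is a made-up example and error handling is simplified:

```
getPIDs(pList) ; collect PIDs of all running copies of utility.exe into pList(pid)=""
    new file,rc,line,pid
    kill pList
    set file="c:\temp\pids.csv" ; hypothetical location of the temp file
    set rc=$zf(-1,"tasklist /FI ""IMAGENAME eq utility.exe"" /FO CSV /NH > "_file)
    quit:rc'=0
    open file:"R":5 else  quit
    try {
        for  {
            use file read line          ; CSV line: "utility.exe","1234",...
            set pid=+$piece($piece(line,",",2),"""",2)
            set:pid pList(pid)=""
        }
    } catch {}                          ; <ENDOFFILE> terminates the loop
    close file
    quit
```

The caller would then $order() through pList() and compare it against the ^offPID subscripts, as in the pseudocode above.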

Thank you, Robert.

In this case the PID saved in the file will be the PID of the command processor (cmd) itself rather than the PID of the utility it has invoked. If the Caché process kills the cmd instance using this PID, the utility will continue its execution. Besides, there is no parent-child relationship between cmd and the utility, so the /t switch (kill process tree) would not help.

Thank you, Stephen.

I should confess that you are quite right: our code resides in .cls, .inc and .int files and also in globals. As to my straightforward code, it is not written yet. At the moment I have a simple "search-in-global engine" whose matching ability should be improved (regexp?) and to which replacement functionality should be added. If speed turns out to be a problem, I will run it against the ^ROUTINE, ^rINC and ^oddDEF globals as well, though I dislike this idea and would prefer to start with the official API (the %Dictionary classes).
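For illustration, such a "search-in-global engine" can be sketched in a few lines of plain COS: $query walks the nodes, and $find does the plain substring matching that a regexp (e.g. via $match) could later replace. The label and argument names here are mine, not the actual engine:

```
searchGlobal(gref,str) ; print every node of @gref whose subscripts or value contain str
    new node
    set node=gref
    for  {
        set node=$query(@node) quit:node=""
        if $find(node,str)!($find(@node,str)) write !,node,"=",@node
    }
    quit
```

E.g. `do searchGlobal^myrtn("^oddDEF","MyMethod")` (routine and arguments are hypothetical). Note that $query skips the unsubscripted root node itself.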

If +value=value

This classic code is good for checking whether the value is a canonical number, but the term "number" can be interpreted in other ways as well, e.g. a number in scientific format, a double-precision number, etc. Caché has a set of out-of-the-box functions for some of these checks: $isvalidnum, $isvaliddouble, $number, $normalize.

So the answer depends on the topic starter's conditions.

E.g., if I need to check whether a value is a numlit (scientific numbers and numbers with leading zeroes are allowed), I'd use `$isvalidnum(x)`. An additional check for being an integer (an intlit) can look like `$isvalidnum(x)&&(+x\1=+x)`. Here are some test results:

USER> w !,?6,"Is number?",?20,"Is integer?",?35,"Is canonic?"
USER> for x="001a","002",0,1,"-1.2","+1.3","1E3" w !,x,?10," ",$isvalidnum(x),?20," ",$isvalidnum(x)&&(+x\1=+x),?35,x=+x
 
      Is number?    Is integer?    Is canonic?
001a       0         0             0
002        1         1             0
0          1         1             1
1          1         1             1
-1.2       1         0             1
+1.3       1         0             0
1E3        1         1             0

In other conditions I'd write different code. There is no universal answer to the topic starter's question.

P.S. As to the "Annotated MUMPS Standard" (http://71.174.62.16/Demo/AnnoStd): neither an intlit nor a numlit is necessarily a canonic representation of a number.

The reduction to a canonical numeric representation involves (colloquially) the removal of any redundant leading and trailing zeroes, the conversion of exponential notation to "mantissa only" notation, and the reduction of any leading plus (+) and minus (-) signs to at most one leading minus sign (see also Numeric interpretation of data).
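That reduction is easy to observe in the terminal with unary +, which forces the numeric interpretation:

```
USER>write +"007"," ",+"1.20"," ",+"1E3"," ",+"+1.3"
7 1.2 1000 1.3
```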

Via SSH (PuTTY, etc.), am I right that you need to call csession <instance> to enter the Caché terminal? If so, this is not really about SSH at all.

Caché has shipped with an embedded libssh2.dll/.so for ages. Why not implement an internal SSH server, which could be a reasonable replacement for the outdated (and Windows-only) telnet one? It seems some other projects (besides Web Terminal) would benefit from it as well.

Our customers are required to check DB integrity on a regular basis, usually weekly, but I don't remember a case when it showed errors that were not evident without it (<DATABASE> errors in the error and console logs, etc.).

The last time I had an opportunity to use ^REPAIR was about 1.5 years ago, when our support specialist defragmented free space in a database under Caché 2015.1.2. The bulletin from InterSystems about the possibility of defragmentation errors arrived a bit later... Thanks to a backup performed before the defragmentation, the opportunity to use ^REPAIR was closed that time :) After the upgrade to 2015.1.4 no errors of this kind were detected in the field.

The faults of the integrity check are:
- when there is concurrent user activity, it may produce false positives in the per-database summary report ("Errors found in database...") while there are no real errors either in the database or in the per-global report;
- (mostly about TASKMGR): there is no way to include in the task completion reports (which can be e-mailed) any information from the task itself, e.g. about errors found by the integrity check.

Sergey, you are doing a great job popularizing our (unfairly) niche technology!

Although the article was published on a private resource, a phrase such as:

They were first introduced in 1966 in the M(UMPS) programming language (which later evolved into Caché ObjectScript, COS), which was initially used in medical databases.

sounds (at least to me) like disrespect to the many talented people who contributed to the technology through many other M implementations. Some of them are already gone...

The truth sounds more like this: COS was developed by InterSystems as a superset of M (see http://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=...). The original language is still embraced by the developer community and alive in a few implementations. There are several signs of activity around the web: the MUMPS Google group, a user group (http://mumps.org/), the effective ISO standard (http://71.174.62.16/Demo/AnnoStd), etc.

90 and 100 lines for the serializers is quite an achievement.

At the moment I have a serializer (<glvn> to JSON string) and a deserializer (JSON string to <glvn>) of 40-60 lines of pure COS (M) code each. They should run on any Caché version; they were tested on 2012.2 and higher. Some proof that they really exist:

USER>d $system.CPU.Dump()
 
-- CPU Info for node maslov --------------------------------------------------
          Architecture: x86
                 Model: Intel(R) Core(TM) i5-4460  CPU @ 3.20GHz
< ... >
USER>d j2gSmall^zmawr

JSON2G^Wmgr Cache for Windows (x86-64) 2017.1
total time = 1.05795 avg time = .0000105795

G2JSON^Wmgr Cache for Windows (x86-64) 2017.1
total time = 1.898275 avg time = .00001898275

sJson <=> arr()? yes ; the result of conversion reversibility check

USER>zw sJson
sJson="{""BirthDate"":""1970-03-25"",""FirstName"":""Sean"",""Hobbies"":[""Photography"",""Walking"",""Football""],""LastName"":""Connelly""}"
 
USER>zw arr
arr("BirthDate")="1970-03-25"
arr("FirstName")="Sean"
arr("Hobbies",0)="Photography"
arr("Hobbies",1)="Walking"
arr("Hobbies",2)="Football"
arr("LastName")="Connelly"

This code is not my own development (though I contributed a bit), so if anybody wants it published, I should redirect the request to the main contributor(s).
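Just to illustrate the general idea (this is NOT the code discussed above, merely a toy sketch): a one-level serializer, with no escaping, no nesting, and every value emitted as a string, fits in a few lines:

```
arr2json(arr) ; toy serializer: arr("key")=value -> {"key":"value",...}
    new json,key
    set json="",key=""
    for  {
        set key=$order(arr(key)) quit:key=""
        set json=json_$select(json="":"",1:",")_""""_key_""":"""_arr(key)_""""
    }
    quit "{"_json_"}"
```

Call it with the array passed by reference: `write $$arr2json(.arr)`. The real 40-60 line versions must additionally handle nesting, JSON arrays, escaping, and the number/string distinction.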

BTW: Sean, was this JSON string

TestRawJson = "{""TestAllAsciiChars"":"" !\""#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]"_$c(142,143,144,145,146,147,148,149,150,151,152,153,154,157,158,159)_"...

 from your sample escaped incorrectly on purpose?

I don't understand the difference between these two kinds of voting :) Which solution is best depends on many factors: if we need turbo performance, we'd take your approach; if not, the %ResultSet-based one. BTW, I guess that scanning the directories is a small part of a bigger task: the files found are processed afterwards, and the processing takes much longer than the directory search.

My last 2c on the cross-platform approach: the main place where a COS developer faces problems is interfacing with third-party software. As a German colleague once told me, "Caché is great for seamless integration".

E.g., I've recently found that the forcible resetting of LD_LIBRARY_PATH by Caché for Linux may cause problems for some utilities on some Linux versions. It's better to stop here; maybe I'll write about it separately.

I'm voting for Rubens's solution as it is OS-independent. Caché is a great sandbox in many cases, so why not use its "middleware" capabilities? A developer's time costs much more than CPU cycles, and every piece of OS-dependent code has to be written and debugged separately for each OS that should be supported.

As to performance, in this very case I doubt that the recursion costs much compared to the system calls. Anyway, it's not a big problem to replace it with an iteration.
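The %ResultSet-based variant mentioned earlier can be sketched like this. A sketch only: the %File:FileSet query parameters and the Type column values are from memory and may differ between Caché versions:

```
scanDir(dir) ; recursively print all files under dir, OS-independently
    new rs,sc,name
    set rs=##class(%ResultSet).%New("%File:FileSet")
    set sc=rs.Execute(dir,"*")
    quit:'sc
    while rs.Next() {
        set name=rs.Get("Name")
        if rs.Get("Type")="D" { do scanDir(name) } ; recurse into a subdirectory
        else { write !,name }
    }
    quit
```

Replacing the recursion with an iteration would only require keeping a local array of directories still to be visited.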

And what will happen if one decides to revert a class definition to a previous version, with the previous storage definition state, having some data already populated using the new schema?

It seems there is no "good" choice between the topic starter's two options, only "bad" vs "worse", unless the business logic is carefully separated from the data and kept in different classes. In that case the probability of having to revert the data class definition should be lower.