The biggest win is to use the three-argument form of $order, which returns the node's value along with the next subscript and so collapses two global references into one:  $ORDER(^[Nspace]LAB(PIDX),1,Data)

As for setting BBData or other small variants like that, it may well be data-dependent and depend on what happens later in the loop, which you haven't shown us.  But generally speaking, if you're going to calculate the $piece more than once, you probably do want to store it in a (private) variable.

You can certainly combine multiple conditions with and and or operators (&& and ||) if that's what you're asking.  Also, constructs like $case and $select can help (in case you haven't encountered them before).

Yes, for sure.  A global node can have raw binary data as its value (and also as a subscript in most global collations, though length is substantially restricted in subscripts).  Also, $listbuild can have binary data as list elements.  If you're storing it as a property of a persistent class, you can use %Binary as the type.

Isn't the algorithm you describe going to lead to data discrepancies?  In particular, you have something like a 1 in 2^32 chance of missing an update because it hashes to the same CRC value.  Maybe this was already obvious to you and it's okay for some reason, but I thought I should say something just in case...

Of course you could use a cryptographic hash function, like $system.Encryption.SHAHash(), but that takes substantial computation time, so you might not be any better off than you would be by actually opening the object and comparing the values directly.  It sounds like either way you're already resigned to traversing every object in the source database.  (If the source database is growing, then this entire approach won't work indefinitely, of course.)
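
To put the 1-in-2^32 figure in perspective, here is a rough back-of-the-envelope sketch in Python.  It assumes an idealized 32-bit checksum whose values are uniformly distributed and independent from one update to the next, which a real CRC only approximates; the function name and the example counts are made up for illustration.

```python
import math


def prob_missed_update(changed_objects: int, hash_bits: int = 32) -> float:
    """Chance that at least one changed object keeps the same checksum.

    Assumes (hypothetically) an ideal hash: each change collides with the
    old value with probability 2**-hash_bits, independently of the others.
    """
    p = 2.0 ** -hash_bits
    # 1 - (1 - p)**n, computed via log1p/expm1 so a tiny p is not rounded away.
    return -math.expm1(changed_objects * math.log1p(-p))


if __name__ == "__main__":
    for n in (1, 1_000_000, 1_000_000_000):
        print(f"{n:>13,} changed objects -> {prob_missed_update(n):.3g}")
```

Under that assumption the risk is negligible for a handful of changes but adds up: for a million changed objects it is on the order of 1 in 4,000 that at least one update goes unnoticed.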

Alex, I agree with you that I wouldn't recommend using this function for any of the use cases you mention. 

Laurel mentions one use case below, where you wish to preserve the state of a DR or backup before performing an application upgrade or data conversion so that it can be viable as a failback if something goes wrong.

Another case (which we mention in documentation) is if you are performing some maintenance activity on the primary host, particularly a virtual host, whereby you expect that it might interrupt connections to the backup and arbiter and you'd rather not have disconnects or failovers occur as a result.  This use case raises some questions, like why not just fail over to the backup before that maintenance, but we'll leave that aside.

There's also the principle that it's good to have a way to shut things off temporarily without needing to dismantle the configuration or shut down the instance entirely.  That can be handy in troubleshooting.  

In the mirror monitor you would see the state of the member as Stopped, which is defined exactly as stopped by an admin; not being connected for other reasons is reported as a different state, like Waiting or Crashed, and that is unchanged.  With this change, we would add a cconsole message when we skip starting mirroring at instance startup because it was stopped by an admin.

At a fundamental level the worry that you attribute to ObjectScript is not really particular to ObjectScript or any other language, but rather an issue of parallel vs serial processing.  The fundamental issue you're raising here is that when programming at the level of individual database accesses ($order or random gets or whatever) one process is in a loop doing a single database operation, performing some (perhaps minimal) computation, and then doing another database operation.  Some of those database operations may require a disk read, but, especially in the $order case, many will not because the block is already cached.  When it does need to do a disk read, the process is not doing computation because, well, this is all serial; the next computation depends on what will be read.  Imagine the CPU portion of the loop could be magically minimized to zero; even then this process could only keep a single disk busy at a time.  However, the disk array you're talking about achieves 50,000 IOPS not from a single disk, but from multiple disks under some theoretical workload that would utilize them all simultaneously. 

Integrity check and the write daemons are able to drive more IOPS because they use multiple processes and/or asynchronous I/O to put multiple I/Os in flight simultaneously.
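
The serial-versus-parallel argument can be put into numbers with Little's law (throughput = requests in flight / latency per request).  The Python sketch below is only a model, not a measurement: the 50,000 IOPS ceiling is the figure from this discussion, while the 0.5 ms read latency and the function name are assumptions picked for illustration.

```python
def achievable_iops(in_flight: int, read_latency_us: int, array_max_iops: int) -> float:
    """Upper bound on read IOPS for a given number of concurrent reads.

    Little's law: throughput = concurrency / latency, capped by what the
    array itself can deliver. CPU time between reads is taken as zero.
    """
    if in_flight < 1 or read_latency_us < 1:
        raise ValueError("in_flight and read_latency_us must be positive")
    uncapped = in_flight * 1_000_000 / read_latency_us
    return min(uncapped, array_max_iops)


if __name__ == "__main__":
    ARRAY_MAX = 50_000   # figure quoted in the discussion
    LATENCY_US = 500     # assumed per-read latency, not a measured value
    for n in (1, 8, 25, 100):
        iops = achievable_iops(n, LATENCY_US, ARRAY_MAX)
        print(f"{n:>3} reads in flight -> {iops:,.0f} IOPS")
```

With these assumed numbers a single serial process tops out around 2,000 IOPS even with zero CPU time, and it takes about 25 reads in flight to reach the array's rated figure.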

Where language, programming skill, and ObjectScript come into play is in how readily a program that wishes to put multiple I/Os in flight can do so.  ObjectScript enables this, primarily, by giving you controls to start multiple jobs (with the JOB command) and good mechanisms to allow those multiple jobs to cooperate.  For a single process, ObjectScript provides $prefetchon to tell the Caché kernel to do disk prefetching asynchronously on behalf of that process, but that only helps in sequential-access-type workloads.

Programming constructs that work at a higher level of abstraction (higher than a single global access) may do some parallelization for you.  Caché has some of these types of things in many different contexts, but %PARALLEL in SQL and the work queue manager come to mind.  (In SQL Server, you are already programming at this higher level of abstraction and indeed it's not surprising that there's parallelization that can happen without the programmer needing to be aware of it.  Under the covers though, this is undoubtedly implemented with the sorts of programming constructs I've described: multiple threads of execution and/or asynchronous I/O.)

Of course, how readily a task can be parallelized is highly specific to what the task is doing, and that is application-specific.  Therefore there are tasks for which this disk array has far more capability than an application may use at a given user load.  However, even an application in which no single task would ever use anywhere near this much disk capability may, when scaled up to tens of thousands of users, indeed want a disk array like this and make use of it quite naturally.  Naturally, not by virtue of the program being written to parallelize individual tasks, but by having thousands of individual tasks running in parallel.

There is no utility to do this.  You're right that to create such a mechanism is just a matter of manipulating the right bits and bytes just so, but it does mean that you'd lose the guarantee that these are identical copies, so we haven't created one.  The only context in which anything like this is available is the special case of converting shadow systems to mirror systems where we do have a migration utility that doesn't require completely resynchronizing the databases.

This is pretty clearly a mistake in the definition of the Search custom query.  We will look into the history a bit more and correct it.  Since the (custom query) Execute method defines the expected arguments, invocation through a resultset works.  Beyond the understandable confusion you had, Mike, it makes sense that this could cause other things not to work, as Dmitry illustrates.

You might want to take a look at the List query in %SYS.Journal.Record.  That's a much nicer interface for searching a journal in my opinion.  Also, I suspect you'll find it performs better for most use cases. 

Hopefully someone will chime in with real-life numbers, but I thought it would be helpful to take you through the principles at play to guide your thinking...

1. With any mirror configuration that is going over a WAN (for failover or just DR), you're going to need to ensure sufficient bandwidth to transfer journals over the network at the peak rate of journal creation.  This is application- and load-specific of course, so it is derived from measuring a reference system running that application.  It's important to base this on the peak journal creation rate, not the average, leaving plenty of room for spikes, additional growth, etc.

2016.1 introduces network compression for journal transfer and that can substantially reduce bandwidth (70% or more for typical journal contents).  Although it can add a computation latency to the latency you'd consider in #2 below, if you're already going to use SSL encryption, compression may actually save some latency compared to SSL encryption alone.  See documentation on Journal data compression.

2. With failover members in different data centers, latency can be a factor for certain application events.  Specifically, it's a factor when an application uses synchronous commit mode transactions or the journal Sync() API to ensure that a particular update is durably committed.  That requires a synchronous round trip to the backup, which of course incurs any network latency.  This is discussed under Network latency considerations.

3. You'll need a strategy for IP redirection when failover occurs. For an intro to the subject, read Mirroring Configurations For Dual Data Centers and Geographically Separated Disaster Recovery.  Then see Mark Bolinsky's excellent article here on the community https://community.intersystems.com/post/database-mirroring-without-virtu....

4. You'll need a location for the arbiter that is in neither of the two data centers, as discussed in Locating the Arbiter to Optimize Mirror Availability.
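
Points 1 and 2 lend themselves to some quick arithmetic.  The Python sketch below is only a planning aid under stated assumptions: the function names, the headroom multiplier, and the sample inputs are hypothetical, and the real peak journal rate, round-trip time, and compression ratio have to come from measuring your own system.

```python
def required_mbit_per_s(peak_journal_mb_per_s: float,
                        compression_reduction: float = 0.0,
                        headroom: float = 1.5) -> float:
    """Link bandwidth (Mbit/s) needed to ship journals at the peak rate.

    compression_reduction is the fraction removed by compression (0.7 = 70%).
    headroom is a hypothetical safety multiplier for spikes and growth.
    """
    if not 0.0 <= compression_reduction < 1.0:
        raise ValueError("compression_reduction must be in [0, 1)")
    on_wire_mb = peak_journal_mb_per_s * (1.0 - compression_reduction)
    return on_wire_mb * 8 * headroom


def max_sync_commits_per_s(local_commit_ms: float, round_trip_ms: float) -> float:
    """Ceiling on synchronous commits per second for ONE serial process,
    assuming each commit waits for a full round trip to the backup."""
    return 1000.0 / (local_commit_ms + round_trip_ms)


if __name__ == "__main__":
    peak = 20.0  # MB/s of journal at peak; an invented sample value
    print("uncompressed:", required_mbit_per_s(peak), "Mbit/s")
    print("70% reduction:", required_mbit_per_s(peak, 0.7), "Mbit/s")
    print("sync commits/s at 9 ms RTT:", max_sync_commits_per_s(1.0, 9.0))
```

The second function only bounds a single process that commits serially; many concurrent users each pay the round trip once per synchronous commit, so aggregate throughput is not limited in the same way.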

I just want to point out that the class thing that folks have been mentioning isn't magical.  It's that %RegisteredObject includes %systemInclude and most classes have that in their hierarchy.  I believe that nothing is implicitly included in an Abstract class...