Creating separate thread in Cache

Has anyone come up with a way to create a separate thread of processing that can achieve shared access to a set of objects created from the initial process?  The situation is this: there is a large, complex set of objects representing a business process.  Some of these objects are in-memory only.  The desire is to spin off a separate thread that could do some ancillary processing on this data set without slowing down the main process.  Any thoughts?


Comments

The current architecture of Caché is process-based, and I don't see an efficient way to achieve what you describe. If the set is large enough, it may pay off to start a pool of processes that can do the post-processing. It does depend on the desired output: another in-memory structure, or persisted data.

In any case, you want to make sure that you have a pool of processes that work on your set; you don't want to spin them up on demand, as this will slow you down.

The work queue manager may be helpful to manage this.

https://community.intersystems.com/post/using-work-queue-manager-process-data-multiple-cores
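To make the suggestion concrete, here is a minimal sketch of the Work Queue Manager (`$SYSTEM.WorkMgr`) pattern. The class `MyApp.Worker`, its `Process` method, and the chunk-id scheme are hypothetical; the idea is that each work unit identifies a slice of persisted data, since workers cannot see the caller's private memory.

```objectscript
/// Hypothetical worker classmethod: each chunk id identifies a slice
/// of data the worker reads from a global (e.g. ^MyApp.Data(chunkId)).
ClassMethod Process(chunkId As %Integer) As %Status
{
    // ... load ^MyApp.Data(chunkId) and do the ancillary work ...
    Quit $$$OK
}

/// Caller: fan the chunks out across a pool of worker processes.
ClassMethod RunAll() As %Status
{
    Set queue = $SYSTEM.WorkMgr.Initialize(, .sc)
    Quit:$$$ISERR(sc) sc
    For chunkId = 1:1:10 {
        Set sc = queue.Queue("##class(MyApp.Worker).Process", chunkId)
        Quit:$$$ISERR(sc)
    }
    Quit:$$$ISERR(sc) sc
    // Block until every queued unit has completed.
    Quit queue.WaitForComplete()
}
```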

The work queues do actually sound pretty close to what is being looked for.  The question would be whether the worker jobs have access to the in-memory objects of the process initiating the workers?

Are there any practical examples of using this?

Memory is private to each process. There is no mechanism for processes to share in-memory objects. You would have to persist an object to a global and have the worker process load it, update it, and save it back to the global for the original process to reload.
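A sketch of that round trip, assuming a hypothetical `%Persistent` class `MyApp.Task`; `%Save()`, `%OpenId()`, and `%Reload()` are the standard persistence calls:

```objectscript
// In the main process: persist the object so a worker can see it.
Set task = ##class(MyApp.Task).%New()
Set task.State = "pending"
Set sc = task.%Save()
Set id = task.%Id()

// In the worker process: load it, update it, save it back.
Set task = ##class(MyApp.Task).%OpenId(id)
Set task.State = "done"
Set sc = task.%Save()

// Back in the main process: reload to pick up the worker's changes.
Set sc = task.%Reload()
```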

Correct. That is the reason why this approach only makes sense if the processing takes a significant amount of time. Otherwise, the overhead of dumping the data twice is just not worth it.

Rich, think about it from this perspective: Caché is inherently a multi-process system. If you need concurrency, you start a new concurrent JOB (or submit work to the work queue).

But there is shared memory used by the whole system: the global buffers. If one process loads a block of a referenced global into memory, it is already in shared memory, and another process will find it there handily.

So, despite the perceived heavyweight nature of this advice, using globals as a means of sharing data is not that bad, and it is pretty straightforward. Yes, for ordinary globals the transaction/journal mechanisms are involved. If you really want to avoid them and make the globals as "in-memory" as possible, you can use ^CacheTemp* or ^mtemp globals, which are mapped to the CACHETEMP database and thus are not journalled. For example, using ^CacheTemp.ApplicationName("pipe") for storing application data will use shared memory, will keep the data in memory as long as it is in use, and will not be journalled, reducing overhead to the minimum. (But please don't forget to use a proper locking discipline if you will be modifying this common data from several processes.)
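A minimal sketch of that locking discipline around a shared ^CacheTemp node; the global name `^CacheTemp.MyApp` and the "pipe" subscript scheme are illustrative:

```objectscript
// Producer process: append an item under a shared, non-journalled
// ^CacheTemp node. The lock guards the multi-node update; take it
// with a timeout so a stuck peer cannot hang us indefinitely.
Set item = "work unit payload"
Lock +^CacheTemp.MyApp("pipe"):5       // 5-second lock timeout
If '$Test { Write "could not acquire lock",! Quit }
Set idx = $Increment(^CacheTemp.MyApp("pipe"))
Set ^CacheTemp.MyApp("pipe", idx) = item
Lock -^CacheTemp.MyApp("pipe")

// Consumer processes walk the same nodes with $Order, under the
// same lock, and kill each node once it has been consumed.
```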

P.S.

There is a close approximation to this almost-in-memory mechanism called "process-private globals", which uses the CACHETEMP mechanisms for reduced overhead but provides an extra service: automatic cleanup of the globals used once the process terminates.

But the problem is that you cannot use process-private globals to exchange data with your child processes, because they are invisible outside the owning process. Unfortunately, there is no inheritance mechanism that passes process-private storage to children when they are created...
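A short illustration of that limitation, using the standard `^||name` syntax for process-private globals (the subscript names are illustrative):

```objectscript
// Parent process: process-private globals use the ^||name syntax
// and live in CACHETEMP, cleaned up automatically at process exit.
Set ^||scratch("total") = 100
Write $Data(^||scratch("total")), !   // defined in THIS process

// A child started with JOB gets its own, empty ^||scratch:
// the parent's process-private data is neither inherited nor
// visible, so it cannot be used to hand work to the child.
Job ##class(MyApp.Child).Run()
```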

Timur,

Thanks for the feedback.  I have used process-private globals in the past.  Unfortunately, neither they nor CACHETEMP will work here, as the saved data needs to survive at least an application failure.  If the customer can accept losing the data on a full system failure (either Caché or the server itself shutting down completely), then CACHETEMP may work.  They would have to make changes to the application, however, as some of the objects involved are non-persistent.

Rich