Well, if you don't use "RemainAfterExit=yes", what happens is that systemd sends a signal to the Cache main process and Cache shuts down right after it starts.

Systemd's "forking" type expects the start command to exit after the main process has forked and detached. Something in the way Cache starts probably doesn't fulfill systemd's expectations, making it believe the process has not forked or that something is still left behind.

RemainAfterExit simply tells systemd to keep considering the service active even after the started process has exited.
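
For reference, a minimal sketch of the kind of unit I mean. The instance name, paths and ccontrol invocations are placeholders for whatever your installation actually uses:

    [Unit]
    Description=InterSystems Cache instance
    After=network.target

    [Service]
    Type=forking
    # Placeholder start/stop commands - adjust instance name and path to your installation
    ExecStart=/usr/bin/ccontrol start CACHE
    ExecStop=/usr/bin/ccontrol stop CACHE
    # Workaround: keep the unit "active" even though systemd loses track of the forked process
    RemainAfterExit=yes

    [Install]
    WantedBy=multi-user.target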

My example is indeed intended for the classical use of one instance.

I think the simplest way to solve the reporting issue is to provide a PID file. If Cache provides one, then systemd can check for a running process under that PID.

PIDFile=
Takes an absolute file name pointing to the PID file of this daemon. Use of this option is recommended for services where Type= is set to forking. systemd will read the PID of the main process of the daemon after start-up of the service. systemd will not write to the file configured here, although it will remove the file after the service has shut down if it still exists.
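
If Cache wrote its main process PID to a file - the path below is purely hypothetical - the relevant part of the unit could look like this:

    [Service]
    Type=forking
    ExecStart=/usr/bin/ccontrol start CACHE
    ExecStop=/usr/bin/ccontrol stop CACHE
    # Hypothetical path - this only works if Cache itself writes the file
    PIDFile=/var/run/cache/CACHE.pid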

Another possible solution is to have a switch telling Cache not to fork and then run the service using Type=simple.
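
With such a switch (the --nofork option below is made up - as far as I know it doesn't exist today), the unit would reduce to:

    [Service]
    Type=simple
    # Hypothetical flag that would keep Cache in the foreground so systemd tracks it directly
    ExecStart=/usr/bin/ccontrol start CACHE --nofork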

I think the average system administrator is aware of the fact that changes to a service's state made outside of systemd might not be reflected in systemd's view of it. The same behavior occurs when you use Apache's or Postfix's service initialization commands directly.

You might have noticed my reference to these tests in the past tense, as something that already happened.

I ran these tests during a POC, so I am not sure we still have the storage controllers to test against. I have already spoken to the client; they will be deciding on the storage system to purchase quite soon.

This type of equipment costs quite a lot and we don't see many of these deployed at customers. The client is willing to postpone their deployment for a day or two. I can use that time to collect the data you require, or we can hold a remote session where you or someone else can see it first hand. I think a remote session is better than exchanging data, because I have limited resources on this issue and only a day or two to exchange metrics.


Well, my test was parallelized, and I have to say I did mention it.

In my test I took a 160GB global and wrote a simple process that starts at a random point in that global and scans forward with $ORDER, jumping ahead by about 200 records after every 10 records read (to force a new seek outside the current block). At each step it reads the node at the current iterator and optionally writes the same data back to it. My aim was to squeeze as many IOPS as I could out of the storage controller. I ran multiple jobs of that process, experimenting by varying the number of read-only and read/write processes.
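
Roughly, each job looked like the sketch below. The global name, subscript range and skip counts are illustrative (integer subscripts assumed); this is not the exact routine I ran:

    TestLoop(doWrite) ; one worker job: random-start $ORDER scan with forced seeks
        new key,count,skip,data
        set key=$random(1000000000)                  ; pick a random starting subscript
        for count=1:1 {
            set key=$order(^TestGlobal(key))
            quit:key=""
            set data=$get(^TestGlobal(key))          ; read the node at the current iterator
            if doWrite set ^TestGlobal(key)=data     ; optionally write the same data back
            ; every 10 reads, skip ~200 records ahead to force a seek outside the current block
            if count#10=0 {
                for skip=1:1:200 { set key=$order(^TestGlobal(key))  quit:key="" }
                quit:key=""
            }
        }
        quit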

The machine had 12 CPUs. I achieved the best performance (and the highest IOPS from the storage controller) at twice as many jobs as the machine has CPUs - 24 jobs. The controller reported roughly 8,000 IOPS while the virtual machine's CPU power was exhausted. That 8,000 IOPS figure was only reached after optimizing the loop down to as few lines as possible (I think the exact number was 6).

Only then did I realize that I can't really squeeze any more out of the storage controller with the given server (a 3-year-old IBM x3550 M4) using ObjectScript code. Multi-threaded C/C++ code would do that, as you also agreed with the IC example.

So the case is really a program that has to run over a huge global, much bigger than the server's global buffers, against a storage controller with very low latency and a high seek rate (because it uses SSDs). The problem as I see it is that even if one splits the workload into several processes, it would still be interpreted ObjectScript code executing it. It would be the same with SQL, because all queries are compiled into .mac routines (interpreted code as well) which execute the query.

I did revert to read-only tests once I understood the issues I was having with the write daemon.

I'm not sure how global prefetching is going to help, because I'm trying to be as random as I can. I'm intentionally trying to defeat the global buffers and Cache's internal optimizations in order to utilize the storage as much as I can.

My tests were against a 1TB database containing the 160GB global. The test machine had 16GB of RAM.