Timeout for $zf
In one of our projects, which uses ECP with 10 ECP application servers, we occasionally ran into journals failing to purge because of open transactions. Since we generate about 100-150 GB of journal files per day, this quickly became a big problem, and with mirroring an even bigger one. Mostly we just restarted our ECP data server so that it would find and roll back any open transactions, but that process is too slow and can take a few hours. I did not find any way to get a list of open transactions in one place on the ECP data server. We have just migrated our data server to 2018.1, while our application servers are still on 2012.2 for various reasons, and we can't migrate them.
I found a useful query, %SYS.Journal.Transaction:List, but 2012.2 does not have it, and it is useless with ECP: it only works for local processes (or perhaps that is because of the outdated application servers). So I had to connect to each server and use %SYS.ProcessQuery to look for any process with open transactions, and I found one. It was a CSP session process that had called one of our external tools with $ZF(-1) and had been hanging on that line for a few days because of errors in the external tool.
It looks like $zf(-1), and even the newer $zf(-100), does not offer any timeout option. What would you recommend in this case? How can I prevent $zf(-1) from hanging for days and limit it to minutes instead?
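To make the desired behavior concrete, here is a minimal sketch of a wrapper script that could be launched in place of the external tool, so that the timeout is enforced outside of $zf. This is only an illustration under assumptions: the choice of Python, the names `run_with_timeout` and `main`, and the exit code 124 (borrowed from the GNU `timeout` convention) are all hypothetical and not part of our setup or of any InterSystems API.

```python
import subprocess
import sys

# Hypothetical exit code reported when the tool had to be killed
# (same value the GNU `timeout` utility uses).
TIMEOUT_EXIT_CODE = 124


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd (a list of arguments) and return its exit code.

    If the command is still running after timeout_seconds, it is killed
    and TIMEOUT_EXIT_CODE is returned instead.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_seconds)
        return completed.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child and waited for it.
        return TIMEOUT_EXIT_CODE


def main(argv):
    """Expected arguments: <script> <timeout_seconds> <command> [args...]"""
    if len(argv) < 3:
        print("usage: wrapper <timeout_seconds> <command> [args...]",
              file=sys.stderr)
        return 2
    return run_with_timeout(argv[2:], float(argv[1]))
```

A script ending with `sys.exit(main(sys.argv))` could then be called with the timeout in seconds followed by the real command line, and the caller would check for the 124 exit code. Note that only the direct child process is killed in this sketch; anything the tool itself spawns may need separate handling.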
Also, can anybody tell me how to easily monitor for transactions that have been open for a long time, and how to find out where they were started, in an ECP configuration?