Question Daniel Metcalfe · Oct 7, 2022

Mirror Failover Settings

Hi All,

Our mirrored HealthShare environment has failed over a few times recently due to underlying infrastructure issues (that are being worked on and resolved).

In the HealthShare logs we are seeing:

10/06/22-00:54:35:925 (4736) 1 Journal Daemon has been inactive with I/O pending for 10 seconds:
gjrnoff=524741316,iocomplete=523852600,filecnt=1011,fail=0
10/06/22-00:54:55:086 (4736) 3 CP: Pausing users because Journal Daemon has not shown
signs of activity for 30 seconds. Users will resume if Journal Daemon is active again

My question is:

While the issue is being resolved, is there anywhere I can increase the time HealthShare will wait for the Journal Daemon to become active? We are seeing it come back after around 40 seconds, so 10 seconds too late to prevent the failover.
If so, are there any advisories or considerations for raising the limit to, say, 45 seconds?

Many thanks

Product version: HealthShare 2018.1
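
For anyone who wants to track how often and how long these stalls occur while the infrastructure work is ongoing, the log lines quoted in the question can be parsed with a short script. This is only a minimal sketch: it assumes the timestamp is in month/day/year order and that every entry follows the layout of the two messages quoted above (timestamp, process ID in parentheses, severity digit, message, with wrapped text on continuation lines). The function names `parse_entries` and `journal_stalls` are made up for illustration and are not part of any InterSystems tooling.

```python
import re
from datetime import datetime

# Assumed entry layout, taken from the excerpt in the question:
#   MM/DD/YY-HH:MM:SS:mmm (pid) severity message
ENTRY_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}:\d{3}) \((\d+)\) (\d) (.*)$"
)

# Matches the two Journal Daemon messages quoted in the question.
STALL_RE = re.compile(
    r"Journal Daemon has "
    r"(?:been inactive with I/O pending|not shown signs of activity)"
    r" for (\d+) seconds"
)


def parse_entries(text):
    """Return (timestamp, severity, message) tuples, joining wrapped lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = ENTRY_RE.match(line)
        if match:
            # Assumption: month/day/year order, milliseconds after the last colon.
            ts = datetime.strptime(match.group(1), "%m/%d/%y-%H:%M:%S:%f")
            entries.append([ts, int(match.group(3)), match.group(4)])
        elif entries:
            # A line without a timestamp continues the previous message.
            entries[-1][2] += " " + line
    return [tuple(entry) for entry in entries]


def journal_stalls(entries):
    """Return (timestamp, severity, seconds_inactive) for Journal Daemon stalls."""
    stalls = []
    for ts, severity, message in entries:
        match = STALL_RE.search(message)
        if match:
            stalls.append((ts, severity, int(match.group(1))))
    return stalls


# The two entries quoted in the question, used as sample input.
SAMPLE = """\
10/06/22-00:54:35:925 (4736) 1 Journal Daemon has been inactive with I/O pending for 10 seconds:
gjrnoff=524741316,iocomplete=523852600,filecnt=1011,fail=0
10/06/22-00:54:55:086 (4736) 3 CP: Pausing users because Journal Daemon has not shown
signs of activity for 30 seconds. Users will resume if Journal Daemon is active again
"""

for ts, severity, seconds in journal_stalls(parse_entries(SAMPLE)):
    print(f"{ts.isoformat()} severity={severity} inactive_for={seconds}s")
```

Run against a full log file, the same two functions would list each stall with its reported duration, which makes it easier to see whether the stalls cluster around particular times.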

Discussion (4)

Comments

Alexander Pettitt · Oct 7, 2022

This is a pretty severe issue with storage. I would examine why the journals are not getting written.
Are you seeing OS errors?

Daniel Metcalfe · Oct 10, 2022

Thanks for the response, Alexander. We aren't seeing any errors in the OS.

Kamal Suri · Oct 11, 2022

You can increase "Quality of Service Timeout (msec)" in the mirror properties to give HealthShare more time to wait before doing a failover.

Daniel Metcalfe · Oct 11, 2022 to Kamal Suri

Thanks Kamal, though I'm not sure whether this is a different setting. Our Quality of Service Timeout is set to 8000 ms, but the failovers are occurring when the Journal Daemon has been inactive for 30 seconds. Thankfully, we haven't seen any more since Friday.