CP: Pausing users because the Write Daemon has not shown - what does this mean?

Caché

I am getting the following error periodically:

CP: Pausing users because the Write Daemon has not shown

Answers

In my experience, I have seen this behavior when there were so many disk writes that the write daemon queue only kept growing.

First, I would run the mgstat tool from InterSystems. It has to be started before the system freezes and left running for quite a long time, with a sampling interval of no more than 5 seconds, or better, 1 second. Then look at columns such as WDphase, PhyWrs, WDQsz, and WDPass.

What to look for:

  • WDphase - should not stay in phase 8 all the time.
  • WDQsz - should fall to zero from time to time.
  • PhyWrs - can help you calculate the "real" speed of writing to the disk, which you can then compare with the expected speed.
  • WDPass - should increase from time to time; if the WD stays in the same cycle for too long, it may cause the system to freeze.
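The checks above can be sketched as a small script run over a saved mgstat capture. This is only an illustration under several assumptions that are not from the original answer: the capture is CSV text whose header row names the columns, any lines before that header can be skipped, column names are matched case-insensitively, PhyWrs is a per-second count of block writes, and the database block size is 8 KB. The function names `parse_mgstat` and `summarize` are hypothetical.

```python
import csv
import io

BLOCK_SIZE_BYTES = 8192  # assumption: 8 KB database blocks


def parse_mgstat(text):
    """Return one dict per sample from mgstat-style CSV text (lowercased keys)."""
    lines = text.splitlines()
    # Assumption: the header is the first line that names the WDQsz column;
    # anything before it is treated as preamble and skipped.
    start = next((i for i, ln in enumerate(lines) if "wdqsz" in ln.lower()), None)
    if start is None:
        raise ValueError("no header row containing WDQsz found")
    reader = csv.reader(io.StringIO("\n".join(lines[start:])))
    header = [name.strip().lower() for name in next(reader)]
    samples = []
    for row in reader:
        if len(row) != len(header):
            continue  # ignore blank or truncated lines
        samples.append({k: v.strip() for k, v in zip(header, row)})
    return samples


def summarize(samples):
    """Apply the WDphase, WDQsz, WDPass, and PhyWrs checks to the samples."""
    if not samples:
        raise ValueError("no samples to analyze")
    phases = [int(s["wdphase"]) for s in samples]
    queue = [int(s["wdqsz"]) for s in samples]
    passes = [int(s["wdpass"]) for s in samples]
    writes = [float(s["phywrs"]) for s in samples]
    avg_writes = sum(writes) / len(writes)
    return {
        # True if every sample shows the write daemon in phase 8.
        "stuck_in_phase_8": all(p == 8 for p in phases),
        # True if the write daemon queue never dropped to zero.
        "queue_never_empties": all(q > 0 for q in queue),
        # True if the pass counter did not change during the capture.
        "pass_counter_stalled": len(set(passes)) == 1,
        # Rough write rate in MB/s, under the block-size assumption above.
        "avg_write_mb_per_s": avg_writes * BLOCK_SIZE_BYTES / 1e6,
    }
```

If all three flags come back true over a long capture, that matches the pattern described above; the estimated write rate can then be compared with what the storage is expected to sustain.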

 

Dmitry makes some good suggestions above. I would suggest collecting a 24-hour pButtons report on this server (you can also collect a 30-minute one simultaneously so you have some data more quickly) and opening a WRC issue with that data. It will contain the mgstat data referenced above, along with performance metrics at the OS level, among other things. Performance issues are generally not quick to solve, and it takes time to review a lot of data before conclusions can be drawn.

I will also say that, if this is happening frequently, it's likely your disk is simply too slow for the amount of work the write daemon needs to do.