Question
Craig Clifford · Mar 4, 2021

Bouncing interface failure to stop job

We have an interface that needs to be disabled and then re-enabled when it starts to queue up. I wrote the following code to do this in a process. It works in our development domain, but in production it reports that it fails to stop the job: it only shuts down the interface, without updating the production or starting the interface back up. Error message: "Failed to stop job '36831290' within 60 seconds. Status '<unknown>'"

Is there something wrong with how I'm trying to do this? 

set tSC = ##class(Ens.Director).EnableConfigItem(itemname,0,0) 

set tSC = ##class(Ens.Director).UpdateProduction(60)

set tSC = ##class(Ens.Director).EnableConfigItem(itemname,1,0) 

set tSC = ##class(Ens.Director).UpdateProduction(60)

 


Do you have Jobs Per Connection or Pool Size set to more than one on the interface in the production?

The reason this wasn't working was that we weren't accepting ACKs on this specific interface. That caused a network queue buildup and prevented a forced disable of the connection. Enabling ACKs fixed the issue.

Hi

May I ask what O/S you are running your live production on? One of the curious things I discovered is that Ubuntu (and other Linux distributions) differs from Windows in the way an Ensemble production shuts down (and notionally restarts). This is especially true if the queues are very long.

On Windows, I was used to setting the timeout to 60 seconds, sometimes even less. So when I had to develop and deploy an application on Ubuntu, I was very confused when we attempted to shut down the production (using 60 seconds) and the whole server froze and became unresponsive. At the time I knew nothing about Ubuntu (or Linux in general). It was only when our Ubuntu expert showed me how to monitor the system during shutdown that I saw what was happening: to my amazement, Ensemble spawned tens of jobs (up to 100), which I assumed it had done to try to clear the queues before shutting down. All of these extra processes consumed CPU time and memory, and of course the machine ground to a halt. The trick: give the production at least 120 seconds, and maybe even 180 seconds, to shut down, and you will find that it behaves perfectly correctly.
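For what it's worth, here is a rough sketch of the same bounce with a longer timeout and a status check after every call, so that a failed stop is reported instead of being silently overwritten by the next call. It uses the same Ens.Director.EnableConfigItem() and UpdateProduction() calls as the original post. The class name Demo.InterfaceBounce, the method name Bounce and the 120-second default are placeholders of mine, and I have not tested this against your production.

```objectscript
/// Hypothetical helper class; the class and method names are illustrative only.
Class Demo.InterfaceBounce Extends %RegisteredObject
{

/// Disable and then re-enable one config item, checking the status of each step.
/// pTimeout is the number of seconds passed to UpdateProduction().
ClassMethod Bounce(pItemName As %String, pTimeout As %Integer = 120) As %Status
{
    // Mark the item as disabled; the third argument (0) defers the production update
    Set tSC = ##class(Ens.Director).EnableConfigItem(pItemName, 0, 0)
    If $SYSTEM.Status.IsError(tSC) { Return tSC }

    // Apply the change, allowing up to pTimeout seconds for the job to stop
    Set tSC = ##class(Ens.Director).UpdateProduction(pTimeout)
    If $SYSTEM.Status.IsError(tSC) { Return tSC }

    // Mark the item as enabled again, still deferring the update
    Set tSC = ##class(Ens.Director).EnableConfigItem(pItemName, 1, 0)
    If $SYSTEM.Status.IsError(tSC) { Return tSC }

    // Apply the change so the item starts back up
    Return ##class(Ens.Director).UpdateProduction(pTimeout)
}

}
```

You could call it as Set tSC = ##class(Demo.InterfaceBounce).Bounce(itemname, 180) and log $SYSTEM.Status.GetErrorText(tSC) if it returns an error. Note that this version returns as soon as any step fails, so if the stop times out the item stays disabled; whether that or always attempting the re-enable is the better behaviour depends on your interface.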

Yours

Nigel