Question
· Nov 4, 2020

Why so many seconds from one rule pass to another rule?

Hi

My Ensemble platform worked well before. 

Since yesterday , I found out lot of slow request ,I opened trace for one message. 

 

 

as above shown. 

It take about 8s from HisEmrRouter to ADTRoutingRule.

 In my opinion, HisEmrRouter And ADTRoutingRule is Simple Rule,  should't do IO operation. should be pure compute. 

Why does it take so long to process the routing?

What Can I do to avoid it spend too much time to routing?

Tks

Discussion (10)1
Log in or sign up to continue

During a period when messages are processing slowly, can you check the queues page (Ensemble >> Monitor >> Queues) and see if there are messages waiting to be processed?

The next step would be to collect some pButtons data to see if there's a performance bottleneck on the system:
https://community.intersystems.com/post/intersystems-data-platforms-and-...

Even before running the pButtons it would be worth doing a quick check in the OS to see which processes (Ensemble and non-Ensemble processes) are using the most CPU and RAM.

So if your HisEmrRouter is running with 10 jobs, which then sends to ADTRoutingRule that has half the number of jobs available, then I can see that this would introduce some form of a bottleneck.

It's worth considering the impact of increasing your app pool beyond 1 when working with healthcare data, and the details of which are noted here: https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls...

It also mentioned not having the poolsize exceed the number of CPU (I assume it means CPU cores) , however this has been contested in the past by other users as seen by the comments of Eduard here: https://community.intersystems.com/post/ensemble-introduction-pool-size-...

If FIFO is not required for your use case, I would at the very least try setting the poolsize to the same value.

4 years later I can safely say that Poolsize can exceed the number of CPU Cores.

The main exception is when a job consumes the entire CPU core. Usually a process mostly waits for network, disk io or some other io or whatever. In that case it's perfectly fine for Poolsize to exceed available CPU Cores - let them wait together. On the other hand if a process consumes a CPU core entirely - in that case Poolsize exceeding CPU cores count won't help.

In general increasing Poolsize overly much can lead to overconsumption of CPU or disk or other resources, which depending on resource concurrency model may (and often does) negatively impact overall performance.

To @alex chang I can recommend monitoring queues count, sql queries, system metrics and cpu/ram/hdd consumption - clearly there's a bottleneck somewhere.

Additionally while Visual Trace is a great tool it does not show everything a process does - only messages sent and received. Is there anything a process does between receiving a request and calling another process?

Finally Monlbl can be used to check which part of the generated BPL code takes a lot of time.