Why does a message take so many seconds to pass from one rule to another?

Question

alex chang · Nov 4, 2020

#Business Rules #Ensemble

Hi

My Ensemble platform was working well until recently.

Since yesterday I have been seeing a lot of slow requests, so I opened the trace for one message.

The trace is shown above.

It takes about 8 s for the message to get from HisEmrRouter to ADTRoutingRule.

In my opinion, HisEmrRouter and ADTRoutingRule are simple rules. They shouldn't do any I/O; they should be pure computation.

Why does it take so long to process the routing?

What can I do to keep routing from taking so much time?

Thanks

Julian Matthews · Nov 4, 2020

If your HisEmrRouter is running with 10 jobs and sends to ADTRoutingRule, which has only half as many jobs available, then I can see how this would introduce a bottleneck.

It's worth considering the impact of increasing the pool size beyond 1 when working with healthcare data. The details are noted here: https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls...

The documentation also advises against having the pool size exceed the number of CPUs (I assume it means CPU cores). However, other users have contested this in the past, as seen in Eduard's comments here: https://community.intersystems.com/post/ensemble-introduction-pool-size-...

If FIFO is not required for your use case, I would at the very least try setting the pool size of both components to the same value.
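To illustrate how a smaller downstream pool can turn into waiting time, here is a minimal Python sketch of a backlog model. It is a toy calculation, not a model of how Ensemble schedules jobs. The arrival rate (60 msg/s) and per-message service time (0.1 s) are invented for illustration; only the pool sizes 5 and 10 come from this thread.

```python
def simulate_backlog(arrivals_per_sec, pool_size, service_time_sec, seconds):
    """Return the number of waiting messages at the end of each second.

    Toy model: every job handles 1 / service_time_sec messages per second,
    so the stage drains at most pool_size / service_time_sec per second.
    """
    capacity = pool_size / service_time_sec
    backlog = 0.0
    history = []
    for _ in range(seconds):
        # Messages that arrive beyond the stage's capacity stay queued.
        backlog = max(0.0, backlog + arrivals_per_sec - capacity)
        history.append(backlog)
    return history


# ASSUMPTION: 60 msg/s at peak and 0.1 s of work per message (made-up values).
for pool in (5, 10):
    final = simulate_backlog(60, pool, 0.1, seconds=30)[-1]
    capacity = pool / 0.1
    print(f"pool={pool}: backlog after 30 s = {final:.0f} messages, "
          f"rough wait for the newest message = {final / capacity:.1f} s")
```

With these assumed numbers the stage with 5 jobs falls behind steadily while the stage with 10 jobs keeps up, which is the kind of pattern that would show up as a gap between two steps in a trace.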

score 0 · Answer 1 · 2020-11-04T08:22:11-05:00

Is there anything major within that Router, or the Transforms within the Router?

I have known people to put custom functions in their transforms and routers. If you have something like that, the function itself could have an issue that is adding the delay.

score 0 · Answer 2 · 2020-11-04T09:14:05-05:00

alex chang · Nov 4, 2020

As shown above, the transform is empty and there are no other functions.

Tks.

score 0 · Answer 3 · 2020-11-04T09:15:28-05:00

Julian Matthews · Nov 4, 2020

Then I'd look to what Marc mentioned about high volumes of messages and items queuing.

score 0 · Answer 4 · 2020-11-04T12:23:32-05:00

During a period when messages are processing slowly, can you check the queues page (Ensemble >> Monitor >> Queues) and see if there are messages waiting to be processed?

The next step would be to collect some pButtons data to see if there's a performance bottleneck on the system:
https://community.intersystems.com/post/intersystems-data-platforms-and-...

Even before running the pButtons it would be worth doing a quick check in the OS to see which processes (Ensemble and non-Ensemble processes) are using the most CPU and RAM.

score 0 · Answer 5 · 2020-11-04T08:00:57-05:00

This may not mean that the rules are processing slowly. Is it possible that there were a large number of messages queued for HisEmrRouter?

score 0 · Answer 6 · 2020-11-04T09:20:43-05:00

About 1.5 million messages are processed every day.
The delay only occurs during peak business periods; processing is closer to normal during quiet periods.
Does this mean that the router's configuration needs to be adjusted to improve its processing capacity?

HisEmrRouter number of jobs (pool size): 10

ADTRoutingRule number of jobs (pool size): 5

Tks
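For a sense of scale, the figures above can be combined with Little's law (L = λW: average number waiting equals arrival rate times average wait). The Python sketch below is a back-of-the-envelope estimate that assumes a roughly steady flow. The 1.5 million messages per day and the 8 s delay come from this thread; the peak factor of 3 is purely an assumption.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(messages_per_day):
    """Average arrival rate in messages per second."""
    return messages_per_day / SECONDS_PER_DAY


def implied_queue_length(arrival_rate, wait_seconds):
    """Little's law: average queue length L = arrival rate * average wait."""
    return arrival_rate * wait_seconds


avg = average_rate(1_500_000)  # daily volume reported in the thread
peak = avg * 3                 # ASSUMPTION: peak load is 3x the daily average

for label, rate in (("average", avg), ("assumed peak", peak)):
    waiting = implied_queue_length(rate, 8)  # 8 s delay seen in the trace
    print(f"{label}: {rate:.1f} msg/s, about {waiting:.0f} messages "
          f"queued for an 8 s wait")
```

If the Queues page shows a backlog of roughly this size in front of ADTRoutingRule at peak time, the 8 s gap is more likely queueing delay than slow rule evaluation.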

score 0 · Answer 7 · 2020-11-04T15:10:00-05:00

Four years later, I can safely say that the pool size can exceed the number of CPU cores.

The main exception is when a job consumes an entire CPU core. Usually a process spends most of its time waiting for the network, disk I/O, or some other I/O. In that case it's perfectly fine for the pool size to exceed the number of available CPU cores: let the jobs wait together. If, on the other hand, each process fully consumes a CPU core, a pool size larger than the core count won't help.

In general, increasing the pool size too much can lead to overconsumption of CPU, disk, or other resources, which, depending on the resource's concurrency model, may (and often does) hurt overall performance.
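The point about jobs that mostly wait can be demonstrated with a small Python sketch. Python threads stand in for Ensemble jobs here, which is only an analogy, and `time.sleep` stands in for waiting on network or disk; the task count and sleep duration are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def io_bound_task(_):
    # Stand-in for a job that waits on I/O and uses almost no CPU.
    time.sleep(0.05)


def run_with_pool(pool_size, task_count=16):
    """Run task_count waiting tasks on pool_size workers; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        list(pool.map(io_bound_task, range(task_count)))
    return time.perf_counter() - start


for size in (2, 16):
    print(f"pool size {size}: {run_with_pool(size):.2f} s for 16 waiting tasks")
```

Because the tasks only wait, the larger pool should finish sooner even on a machine with few cores. Replacing the sleep with a CPU-heavy loop would remove most of that benefit, which matches the exception described above.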

To @alex chang: I recommend monitoring queue counts, SQL queries, system metrics, and CPU/RAM/disk consumption. Clearly there's a bottleneck somewhere.

Additionally, while Visual Trace is a great tool, it does not show everything a process does, only the messages sent and received. Does the process do anything between receiving a request and calling another process?

Finally, Monlbl can be used to check which part of the generated BPL code takes a lot of time.

score 0 · Answer 8 · 2020-11-04T09:21:03-05:00

score 0 · Answer 9 · 2020-11-05T01:28:19-05:00

alex chang · Nov 5, 2020
