Ensemble/Healthshare server sizing on Amazon AWS EC2 servers

I am designing the software architecture for an Ensemble/Healthshare production to be deployed on Amazon AWS EC2 servers (2 mirrored m4.large instances, each with 4 vCPUs / 16 GiB RAM, running RedHat Linux 3.10.0-327.el7.x86_64 and Healthshare for RHEL 64-bit 2016.2.1). It's a rather CPU-intensive production involving massive XSLT 2.0 transformations (massive both in terms of size and volume). I was wondering if anyone has experience configuring Ensemble productions on EC2 servers. My question has to do with the following statement in the Ensemble documentation:

"InterSystems recommends that, as a maximum, you set the Actor Pool Size equal to the number of CPUs in your Ensemble server machine. You could set the number higher, but at any one time there are only as many jobs available as there are CPUs."

Given the presence of hyperthreading, does "number of CPUs" equate to the number of vCPUs (4 in our case) or the number of virtual cores (2)? I have read an article in the Developer Community on Caché server sizing in virtualized environments suggesting that hyperthreading should not be taken into account and that, for purposes of sizing, 1 virtual core should be considered equivalent to one CPU. However, that article was discussing VMware, not EC2, and I don't know whether that makes any difference. I'd appreciate any information or suggestions from people with real-world experience working with production Ensemble/Healthshare healthcare applications deployed on EC2.

Thank you.

Ray Lawrence

Interoperability Team

QuintilesIMS

 


Answers

Hi Raymond,

Thank you for your question. We have done a lot of testing with EC2, and the performance of an EC2 instance will vary between on-demand and reserved instances, even for the same EC2 instance type. In AWS, each vCPU reported for a given EC2 instance type is an individual thread on the processor, presented as a "logical processor". The OS (and Ensemble/HealthShare as well, for that matter) will only see a given instance's number of vCPUs, and the OS will only schedule jobs on those as it sees them. Ensemble and HealthShare are process based, not thread based, so an instance type of m4.large with 4 vCPUs means that only 4 jobs will execute in parallel at a time.
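To see what the OS is actually presented with on a given instance, the logical processor count can be compared with the number of distinct cores. The following is only a rough sketch in Python, not anything specific to Ensemble/HealthShare: the function name `count_cpus` and the sample text are made up for illustration, and it assumes a Linux guest whose `/proc/cpuinfo` exposes the `physical id` and `core id` fields (if the hypervisor hides them, the sketch simply falls back to the logical count).

```python
import os


def count_cpus(cpuinfo_text):
    """Return (logical_processors, distinct_cores) from /proc/cpuinfo-style text.

    Hypothetical helper for illustration only.
    """
    logical = 0
    cores = set()
    # Each logical processor is described by one block; blocks are
    # separated by a blank line.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        # Two hyperthreads on the same core share (physical id, core id).
        if "physical id" in fields and "core id" in fields:
            cores.add((fields["physical id"], fields["core id"]))
    # Fall back to the logical count when topology fields are missing.
    return logical, (len(cores) if cores else logical)


# Made-up sample: 4 logical processors that are 2 cores with 2 threads each.
SAMPLE = """processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 0
core id : 0

processor : 3
physical id : 0
core id : 1
"""

if __name__ == "__main__":
    print("sample:", count_cpus(SAMPLE))
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo") as f:
            print("this host:", count_cpus(f.read()))
```

The first number is what the OS schedules processes on; the second only indicates how many of those threads share a core.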

In your specific case, with the large amount of XSLT parsing and adjusting pool sizes, you will want to first determine whether FIFO is a requirement. If so, then unfortunately you need to remain at a pool size of 1 to ensure FIFO. However, if FIFO is not required in your production or in a given Business Service/Process/Operation, you can adjust the pool sizes to values higher than 1 to manage the message queues. Having a larger pool size won't improve the performance of a single XSLT parse, but it will allow more messages and XSLT parses to be processed in parallel. If you see CPU utilization at 100% and the message queues continually grow, you may need a larger EC2 instance type (and a larger pool size) to accommodate the message rates.
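As a back-of-envelope illustration of the last point, here is a very simplified Python sketch. It assumes each message is purely CPU-bound and costs a fixed amount of CPU time, which a real production will not match exactly; the function names and all numbers are hypothetical, and it is no substitute for benchmarking.

```python
def max_throughput(pool_size, cpus, cpu_seconds_per_message):
    """Upper bound on messages per second for CPU-bound work.

    At most min(pool_size, cpus) jobs can run at the same instant.
    """
    return min(pool_size, cpus) / cpu_seconds_per_message


def queue_grows(arrival_rate, pool_size, cpus, cpu_seconds_per_message):
    """True if messages arrive faster than they can be processed."""
    return arrival_rate > max_throughput(pool_size, cpus, cpu_seconds_per_message)


if __name__ == "__main__":
    # Hypothetical numbers: 4 CPUs, 0.5 CPU-seconds per XSLT transformation.
    for pool in (1, 4, 8):
        rate = max_throughput(pool, 4, 0.5)
        print(f"pool size {pool}: at most {rate:.1f} messages/second")
    # With 10 messages/second arriving, even a pool of 8 cannot keep up
    # on 4 CPUs, so the queue would keep growing.
    print(queue_grows(10, 8, 4, 0.5))
```

In this model, raising the pool size beyond the number of CPUs does not raise the bound, which is why a growing queue at 100% CPU points to a larger instance rather than just a larger pool.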

I hope this helps.  

Kind regards,

Mark B-

Mark,

Thanks for your reply - much appreciated. I actually posted a similar question as a comment on a VMware sizing post from your colleague, Murray Oldfield. I do recognize that "mileage will vary" and there's probably no substitute for actual benchmarking. However, Murray made the following observation regarding EC2 sizing:

"If you know your sizing on VMware or bare metal with hyperthreading enabled and you usually need 4 cores (with hyperthreading) - I would start with sizing for 8 EC2 vCPUs. Of course you will have to test this before going into production."

Here is the link to Murray's post: https://community.intersystems.com/post/virtualizing-large-databases-vmware-cpu-capacity-planning. You'll find our discussion thread in the comments at the very end of the post. I just want to make sure that you guys are not saying something different. I got the impression from Murray's article that you can never really have more processes executing at exactly the same instant than the number of physical cores. That is really tough to apply to EC2, because you never know the number of physical cores anyway, just virtual cores. AWS does state that, except for the t2 family and m3.medium, 2 vCPUs = 1 virtual core. Based on Murray's article and his comments, that would lead me to believe that, except for t2 and m3.medium, you can only have one OS process executing at a time for every 2 vCPUs. Am I missing something? I suppose this really revolves around an understanding of how Xeon hyperthreading works more than EC2 topology itself (an understanding I admit I don't have).

We're also very interested in any benchmarking data you have showing differences between dedicated instances vs. default instances (I think you used the terms reserved vs. on-demand but those are just AWS billing modalities).

Regards,

Ray