We are receiving more and more requests for VSS integration, so there may be some movement on it; however, there are no guarantees or commitments at this time.

Regarding the alternative of a crash-consistent backup: yes, it would be safe as long as the databases, WIJ, and journals are all included in a consistent point-in-time snapshot.  The databases in the backup archive may be "corrupt", and they will not be physically accurate until Caché is started and the WIJ and journals are applied.  Just as you said - a crash-consistent backup with WIJ recovery is the key to a successful recovery.

I will post back if I hear of changes coming with VSS integration.

Hi Dean - thanks for the comment.  There are no changes required from a Caché standpoint; however, Microsoft would need to add similar functionality to Windows to allow Azure Backup to call a script within the target Windows VM, similar to how it is done with Linux.  Once Microsoft provides that capability, the scripting from Caché would be exactly the same on Windows except for using .BAT syntax rather than Linux shell scripting.  Microsoft may already have this capability - I'll have to check whether they have extended it to Windows as well.
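For reference, here is a minimal sketch of what the Linux pre/post snapshot scripts could look like, using the Backup.General ExternalFreeze/ExternalThaw calls.  The instance name, paths, and status-code handling are examples only and should be checked against the documentation for your Caché version; the Windows version would be the same calls wrapped in .BAT syntax.

    #!/bin/sh
    # Pre-snapshot: freeze Caché writes (instance name "CACHE" is an example).
    csession CACHE -U%SYS "##Class(Backup.General).ExternalFreeze()"
    status=$?
    if [ $status -eq 5 ]; then
        echo "Caché instance frozen for snapshot"
    else
        echo "ExternalFreeze failed or returned unexpected status $status" >&2
        exit 1
    fi

    # ... the snapshot is taken here by Azure Backup (normally the freeze and
    # thaw would live in separate pre- and post-snapshot scripts) ...

    # Post-snapshot: thaw Caché writes again.
    csession CACHE -U%SYS "##Class(Backup.General).ExternalThaw()"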

Regards,
Mark B-

Hi Raymond,

Thank you for your question - I can help with this.  We have done a lot of testing with EC2, and the performance of an EC2 instance will vary between on-demand and reserved instances, even of the same EC2 instance type.  In AWS, each vCPU reported for a given EC2 instance type is an individual thread on the processor (a "logical processor").  The OS (and Ensemble/HealthShare as well, for that matter) will only see a given instance's number of vCPUs, and the OS will only schedule jobs on those as it sees them.  Ensemble and HealthShare are process based - not thread based - so an instance type with 4 vCPUs, such as an m4.xlarge, will mean only 4 jobs will execute in parallel at a time.

In your specific case with the large amount of XSLT parsing and adjusting pool sizes, you will want to first determine whether FIFO is a requirement; if so, then unfortunately you need to remain at a pool size of 1 to ensure FIFO.  However, if FIFO is not required in your production or for a given Business Service/Process/Operation, you can adjust the pool sizes to values higher than 1 to manage the message queues.  Having a larger pool size won't improve the performance of a single XSLT parse, however it will allow more messages and XSLT parses to run in parallel.  If you see CPU utilization at 100% and the message queues continually grow, you may need a larger EC2 instance type (and larger pool size) to accommodate the message rates.

I hope this helps.  

Kind regards,

Mark B-

Hi Mack,

I can help here.  The VMS/Itanium system you are migrating from is quite old and has quite slow processors.  For something like this you can figure at least 4 of the McKinley cores (maybe more) per single current-model Intel Xeon E5 v4 series core.  I would look at a server such as a single-socket system with an Intel Xeon E5-2667v4 processor and 64GB of RAM (more RAM doesn't hurt either).  The E5-2667v4 is an 8-core processor @ 3.2GHz, which is far more CPU than you would need, however it's actually quite difficult to get a smaller server these days.

For a workload like this, a virtual machine in vSphere, Hyper-V, or KVM would probably be more appropriate.

Also, I have a few comments on your current Caché configuration:

  • The amount of routine buffers you have configured (3584MB) exceeds the maximum allowed (the max is only 1023MB).  You can confirm in your cconsole.log that startup actually reduced it to the max value (see the sketch after this list).  You will want to update your routine cache size to 1023MB so that it takes effect on the next Caché restart.
  • I see you have 512MB of 2KB database buffers allocated and 43496MB of 8KB buffers.  I would suggest removing the allocation of the 2KB buffers completely and just allowing any 2KB databases you have to use the 8KB buffers.  That way you aren't artificially capping your database cache.
  • Speaking of 2KB databases, if you still actually have 2KB databases on your system, it is highly recommended to convert them to 8KB databases for data safety and performance reasons.
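A quick way to cross-check the configured values against what Caché actually allocated at startup - the install paths below are examples, so substitute your own:

    # Configured values in the CPF (routine buffers in MB, database cache per block size)
    grep -i "^routines=" /usr/cachesys/cache.cpf
    grep -i "^globals="  /usr/cachesys/cache.cpf
    # Startup messages showing what was actually allocated
    grep -i "routine" /usr/cachesys/mgr/cconsole.log | tail
    grep -i "buffer"  /usr/cachesys/mgr/cconsole.log | tail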

Kind regards,

Mark B-

Hi Alexander,

Thank you for your post.  We are only relying on what the RH documentation states as to when THP was introduced to the mainstream kernel (2.6.38) and enabled by default, as noted in the RH post you referenced.  The option may have existed in earlier kernels (although I would not recommend trying it), but it may not have been enabled by default.  All the documentation I can find on THP support in RH references the 2.6.38 kernel where it was a merged feature.

If you are finding it in earlier kernels, confirm whether THP is enabled by default or not - that would be interesting to know.  Unfortunately there isn't much we can do other than the checks for enablement mentioned in the post.  As the ultimate confirmation, RH and the other Linux distributions would need to update their documentation to confirm when this behavior was enacted in the respective kernel versions.
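For convenience, the enablement check is a one-liner; the sysfs path varies by distribution and kernel, so both common locations are shown:

    # The active THP setting is shown in brackets, e.g. "[always] madvise never"
    cat /sys/kernel/mm/transparent_hugepage/enabled            # mainline kernels
    cat /sys/kernel/mm/redhat_transparent_hugepage/enabled     # older RHEL 6 kernels
    uname -r    # the kernel version you are checking against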

As I mentioned in other comments, the use of THP is not necessarily a bad thing and won't cause "harm" to a system, but there may be performance impacts for applications that have a large amount of process creation as part of their application.

Kind regards,

Mark B-

Hi Alexey,

Thank you for your comment.  Yes, both THP and traditional/reserved huge pages can be used at the same time, however there is no benefit, and in fact systems with many (thousands of) Caché processes, especially if there is a lot of process creation, have shown a performance penalty in testing.  The overhead of instantiating the THP for those processes at a high rate can be noticeable.  Your application may not exhibit this scenario and may be OK.
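To see whether both mechanisms are actually in play on a system, /proc/meminfo shows the reserved pool and THP usage side by side:

    # HugePages_Total/Free = the explicitly reserved pool (used by the Caché shared
    # memory segment when huge page support is enabled);
    # AnonHugePages > 0 = transparent huge pages being handed out to processes.
    grep -E "HugePages_Total|HugePages_Free|Hugepagesize|AnonHugePages" /proc/meminfo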

The goal of this article is to provide guidance for those that may not know which is the best option to choose and/or point out that this is a change in recent Linux distributions.  You may find that THP usage is perfectly fine for your application.  There is no replacement for actual testing and benchmarking your application.  :)

Kind regards,

Mark B-

Hi Anzelem,

Here are the steps that need to be defined in your VCS cluster resource group with dependencies.

  • Remount the storage <— this is not new
  • Relocate the cluster IP <— this is not new
  • Simple VCS application/script agent to restart the ISCAgent <— THIS IS NEW
  • ISC VCS cluster agent to start Caché <— this is not new (make the previous step a dependency before executing)

The script to start the ISCAgent would be dependent on the storage being mounted in the first step. 
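As a rough illustration only (resource, group, and path names are examples, and the Application agent attributes should be verified against your VCS version), the new resource and its dependencies might be defined along these lines:

    haconf -makerw
    # New Application resource that starts/stops the ISCAgent
    hares -add iscagent Application cache_sg
    hares -modify iscagent StartProgram "/etc/init.d/ISCAgent start"
    hares -modify iscagent StopProgram  "/etc/init.d/ISCAgent stop"
    hares -modify iscagent MonitorProcesses "ISCAgent"
    hares -modify iscagent Enabled 1
    # Dependencies: ISCAgent requires the mounted storage; Caché requires the ISCAgent
    hares -link iscagent cache_mount        # parent depends on child
    hares -link cache_resource iscagent
    haconf -dump -makero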

This should provide you with the full automation needed here.  Let me know if there are any concerns or problems with the above steps.

Regards,

Mark B-

Thank you for your comment.  You will need to establish your own monitoring and, ultimately, the range of IO response times for your application using tools like iostat.  This article is meant to give you a starting point for monitoring; your specific application may have higher or lower requirements.

Using iostat, you want to continuously monitor storage device performance (specifically with the iostat -x <device> <time between samples in seconds> <number of iterations> command) and monitor it for a particular range of time - for example, if you want to monitor only during peak business hours from 8am-12pm.  What is most important is average response times - typically I like using iostat -x <devices> 2 1000 to report 1000 2-second samples.  This is useful when diagnosing a performance issue.

To reduce the amount of data collected you can use a longer time between samples, such as iostat -x <devices> 5 1000 for 5-second samples, or even longer if you wish.  It's really a function of why you are monitoring - if doing an in-depth performance analysis you would want a short time between samples to better observe spikes in response times, whereas if you are doing just daily statistics collection you could go with a longer time between samples.  The objective here is to get familiar with your specific application's needs; this article just provides a baseline for what is typical for most applications.
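As a simple sketch (device names, the sample window, and the output path are examples only), a peak-window collection could look like this:

    #!/bin/sh
    # Collect timestamped 2-second extended iostat samples for a 4-hour peak window
    # (4 hours = 7200 two-second samples) and keep them for later review.
    OUT=/var/tmp/iostat_peak_$(date +%Y%m%d).log
    iostat -xt sda sdb 2 7200 >> "$OUT"
    # Afterwards, scan the file for samples where the average wait (await) exceeds
    # your target; the exact column position depends on your sysstat version.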

Kind regards,

Mark B-

Hi Ron,

There are many options available for many different deployment scenarios.  Specifically for the multi-site VPN, you can use the Azure VPN Gateway.  Here is a diagram provided by Microsoft's documentation showing it.

Here is the link as well to the multi-site VPN details.

As for Internet gateways, yes, Azure has that concept, and the load balancers can be internal or external.  You control access with network security groups, and you can also use Azure Traffic Manager and Azure DNS services.  There are tons of options here, and it's really up to you what and how you want to control and manage the network.  Here is a link to Azure's documentation about how to make a load balancer Internet facing.

The link to the code for some reason wasn't marked as public in the github repository.  I'll take care of that now.

Regards,

Mark B-

Hi Matthew,

Thank you for your question.  Pricing is tricky and best discussed with your Microsoft representative.  When looking at premium storage accounts, you only pay for the provisioned disk type, not transactions, however there are caveats.  For example, if you need only 100GB of storage, you will be charged for a P10 disk @ 128GB.  A good Microsoft article to help explain the details can be found here.

Regards,

Mark B

Setting the TZ environment variable needs to be done in the system-wide profile, such as /etc/profile.  This should define it properly for you.  I would recommend a restart of Caché after setting it in /etc/profile.
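For example (the time zone value below is just an illustration - use your own):

    # In /etc/profile (system-wide):
    export TZ="America/New_York"
    # Then restart Caché so its processes pick up the new environment:
    #   ccontrol stop <instance>
    #   ccontrol start <instance>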

Also, the impact of the TZ environment variable not being set should be reduced (or eliminated) with the current 2016.1+ releases, where we have changed the way this operates.

Kind regards,
Mark B-

Hi Steve,

There are multiple ways to accomplish this, and it really depends on the security policies of a given organization.  You can do as you outlined in the original post, you can do as Dmitry has suggested, or you can even take it a step further and provide an external-facing DMZ (eDMZ) and an internal DMZ (iDMZ).  The eDMZ contains only the load balancer, with firewall rules allowing only HTTPS access to load balance to the web servers in the iDMZ, and the iDMZ then has firewall rules allowing only TLS connections to the super server ports on the APP servers behind all firewalls.
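As a simplified illustration only (the addresses, subnets, and use of iptables syntax are examples - a real deployment would implement equivalent rules on the organization's firewalls):

    # eDMZ firewall: only HTTPS from the Internet to the load balancer
    iptables -A FORWARD -p tcp -d 10.1.1.10 --dport 443 -j ACCEPT
    # iDMZ firewall: only the load balancer may reach the web servers, HTTPS only
    iptables -A FORWARD -p tcp -s 10.1.1.10 -d 10.2.1.0/24 --dport 443 -j ACCEPT
    # Internal firewall: only the web servers may reach the APP servers' superserver
    # port (1972 is the Caché default), carried over TLS
    iptables -A FORWARD -p tcp -s 10.2.1.0/24 -d 10.3.1.0/24 --dport 1972 -j ACCEPT
    # Everything else is dropped
    iptables -P FORWARD DROP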

Here is a sample diagram describing the eDMZ/iDMZ/Internal network layout.

So, as you can see there are many ways this can be done, and the manner in which to provide network security is up to the organization.  It's good to point out that InterSystems technologies can support many different methodologies of network security from the most simple to very complex designs depending on what the application and organization would require.

Kind Regards,

Mark B

Hi all, I'd like to offer some input here.  Ensemble workloads are traditionally mostly updates when used purely for message ingestion, some transformations, and delivery to one or more outbound interfaces.  As a result, expect to see low physical read rates (as reported in ^mgstat or ^GLOSTAT); however, if there are additional workloads such as reporting or applications built alongside the Ensemble productions, they may drive a higher rate of physical reads.

As a general rule to size memory for Ensemble, we use 4GB of RAM for each CPU (physical or virtual CPU) and then use 50-75% of that RAM for global buffers.  So on a 4-core system, the recommendation is 16GB of RAM with 8-12GB allocated to the global buffers.  This would leave 4-8GB for the OS kernel and Ensemble processes.  When using very large memory configurations (>64GB), using the 75% rule rather than only 50% is ideal because the OS kernel and processes won't need so much memory.
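A trivial back-of-the-envelope version of that rule (the 50-75% split is a starting point, not an absolute):

    #!/bin/sh
    # Memory sizing sketch: 4GB RAM per core, 50-75% of RAM to global buffers
    CORES=$(nproc)
    TOTAL_RAM_GB=$((CORES * 4))
    GBUF_LOW_GB=$((TOTAL_RAM_GB / 2))        # 50% to global buffers
    GBUF_HIGH_GB=$((TOTAL_RAM_GB * 3 / 4))   # 75% to global buffers
    echo "Cores: $CORES  RAM: ${TOTAL_RAM_GB}GB  Global buffers: ${GBUF_LOW_GB}-${GBUF_HIGH_GB}GB"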

One additional note: we highly recommend the use of huge_pages (Linux) or Large_pages (Windows) to provide much more efficient memory management.
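On Linux, reserving huge pages is a kernel setting; as a rough example only (the page count must be sized from your actual Caché shared memory allocation, which is reported at startup):

    # Reserve enough 2MB huge pages to cover the Caché shared memory segment.
    # e.g. ~12GB of shared memory => 12 * 1024 / 2 = 6144 pages, plus a little headroom
    echo "vm.nr_hugepages = 6200" >> /etc/sysctl.d/hugepages.conf
    sysctl -p /etc/sysctl.d/hugepages.conf
    grep HugePages /proc/meminfo    # confirm the pool was allocated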

Hello,

I cannot name specific customers, however this is a configuration used with TrakCare and TrakCare Lab deployments (prior to TrakCare Lab Enterprise, which now integrates lab directly as a module into a single TrakCare instance).  In that configuration, TrakCare and TrakCare Lab are each separate failover mirror sets, and TrakCare Analytics is defined as a single Reporting Async mirror member to be the data source used to build and support the TrakCare Analytics DeepSee cubes and dashboards in a single instance.

This is our standard architecture for TrakCare based deployments.  I hope this helps.  Please let me know if there are specific questions or concerns with this deployment model.

Kind regards,

Mark B-

Hi Alex,

You are correct that latency is only a major consideration for synchronous (failover) mirror members.  For an async member, latency to/from the primary mirror member does not slow down the primary mirror member's processing.  As you mentioned, it only impacts the delay in the async mirror member being "caught up".  Your example is perfectly fine for DR Async, and if the DR Async should fall behind for any reason, it will put itself into "catch-up mode".  In all cases this does not impact the primary mirror member's performance.

I'd like to mention that for DR Async mirror members we also use compression to be sensitive to bandwidth requirements, so when sizing a WAN link for DR Async, consider that the bandwidth requirements will be lower due to compression.

As for cascading mirrors, that is not a feature we support today.

Thanks again for your excellent questions.

Kind regards,
Mark B-

Yes.  Latency is a major factor when considering geographically splitting synchronous mirrors.  You will need to really understand the given application and workload to know how much latency can be tolerated.  Some applications can accept latency (to a certain level) however others may not.

We do have deployments with the synchronous members located in different locations, separated by about 100 miles with single-digit-millisecond latency, so there is tolerable latency in this configuration for this application.

Unfortunately there is no absolute formula here to determine if a particular application can leverage that type of deployment strategy.  The first thing to consider is to monitor the current journal physical write rate of the application with ^mgstat or ^pButtons during peak workloads.  You also need to understand whether ECP is heavily used, because this will have an impact on the number of journal sync calls for ECP durability guarantees.  Usually looking at IO rates on the journal volume with iostat (Linux or UNIX) or PERFMON.EXE (Windows) will give you a good indication of the mirror throughput you will need.  Using that figure you can work out what the maximum latency should be as a start.

Here is an example:

Say on a given system you see the journal write rate from pButtons/mgstat is relatively low at only 10-20 journal writes per second.  Let's assume these are full 64KB journal buffer writes - so bandwidth requirements will be in the neighborhood of 1.3 Mbytes/second (or 10 Mbit/second) as a minimum.  I would recommend allocating at least 20 Mbit or more to ensure spikes can be handled efficiently.  However, when looking at the iostat output you notice the journal volume is doing 200 writes per second because the application is using ECP clients (application servers).

So with this example, we know that at a minimum synchronous mirroring will need at least 20Mbps of bandwidth and latency of less than 5 milliseconds.  I came to the 5 millisecond figure by taking 1000 milliseconds (1 second) and dividing by 200 journal IOPS, which gives the maximum latency of 5ms to sustain 200 IOPS.  This is by no means the absolute requirement for the application - it is a simple starting point for understanding the requirement scope for WAN connectivity, and the application needs to be thoroughly tested to confirm transaction/processing response times are adequate.
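Here is the same arithmetic expressed as a small sketch you can adapt, substituting the rates observed in your own pButtons/mgstat and iostat data:

    #!/bin/sh
    JOURNAL_WRITES_PER_SEC=20    # from ^mgstat / ^pButtons at peak
    JOURNAL_IOPS=200             # from iostat on the journal volume (includes ECP sync calls)
    # Bandwidth, assuming full 64KB journal buffer writes
    BW_MBIT=$(( JOURNAL_WRITES_PER_SEC * 64 * 8 / 1024 ))   # ~10 Mbit/s minimum
    # Maximum tolerable latency to sustain the observed journal IOPS
    MAX_LATENCY_MS=$(( 1000 / JOURNAL_IOPS ))                # 5 ms in this example
    echo "Minimum bandwidth: ${BW_MBIT} Mbit/s (allocate 2x or more for spikes)"
    echo "Maximum latency:   ${MAX_LATENCY_MS} ms"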

I hope this helps.

Regards,

Mark B-

Hi Alexey,

WAN connectivity varies significantly and many factors play into the requirements and latency.  You can get very good (fast and reliable) WAN connectivity, however distance impacts latency, so you need to be careful in your planning.

As for deciding which mirror to promote... this is one of the reasons we do not recommend automating the promotion of a DR Async member to become primary.  You will want to evaluate the state (or reported latency) within the ^MIRROR utility on each DR Async member to determine which one (maybe both?) is current or not.  If they are out of sync with each other, you will need to manually rebuild the "new backup" in the secondary data center based on the newly promoted DR Async member.

Regards,

Mark B-