Replies by Mark Bolinsky for InterSystems Developer Community

Mark Bolinsky · Apr 4, 2016

Yes. Latency is a major factor when considering geographically splitting synchronous mirrors. You will need to really understand the given application and workload to know how much latency can be tolerated. Some applications can accept latency (to a certain level) however others may not.

We do have deployments with each synchronous member located in different locations and latency is single digit millisecond latency and only separated by about 100 miles, so there is tolerable latency in this configuration for this application.

Unfortunately there is no absolute formula here to determine if a particular application can leverage that type of a deployment strategy. The first things to consider is monitor the current journal physical write rate of the application with ^mgstat or ^pButtons during peak workloads. You also need to understand if ECP is heavily used because this will have an impact on the the number of journal sync calls for ECP durability guarantees . Usually looking at IO rates with iostat (Linux or UNIX) or PERFMON.EXE (Windows) of the journal volume will give you a good indication of the mirror throughput you will need. Using that figure you can work out what maximum latency should be as a start.

Here is an example:

Say on a given system you see the journal write rate from pButtons/mgstat is relatively low at only 10-20 journal writes per second. Let's assume these are full 64KB journal buffer writes - so bandwidth requirements will be in the neighborhood of 1.3 Mbytes / second (or 10Mbit / second) as a minimum. I would recommend allocating at least 20Mbit or more to ensure spikes can be efficiently handled. However when looking at iostat output you notice the journal volume is doing 200 writes per second because the application is using ECP clients (application servers).

So with this example, we know that at a minimum synchronous mirroring will need at least 20Mbps of bandwidth and latency less than 5 milliseconds. I came to the 5 millisecond requirement by taking 1000 milliseconds (1 second) and divide by 200 journal IOPS. This gives the maximum latency of 5ms to sustain 200 IOPS. This is by no means the absolute requirement for the application. This is a simple starting point to understanding the requirement scope for WAN connectivity, and the application needs to be thoroughly tested to confirm transaction/processing response times are adequate.

I hope this helps.

Regards,

Mark B-

go to post

Mark Bolinsky · Apr 4, 2016

Hi Alexey,

WAN connectivity varies significantly and many factors play into the requirements and latency. You can get very good (fast and reliable) WAN connectivity, however distance impacts latency, so you need to be careful in your planning.

As for deciding which mirror to promote... This is one of the reasons we do not recommend automating the promotion of a DR Async member to become primary. You will want to evaluate the state (or reported latency) within the ^MIRROR utility on each DR Async member to determine which one (maybe both?) are current or not. If they are out of sync with each other, you will need to manually rebuild the "new backup" in the secondary data center based on the newly promoted DR Async member.

Regards,

Mark B-

go to post

Mark Bolinsky · Mar 25, 2016

Hi Francis,

You are absolutely right that memory access performance is vital, however this is not only bandwidth but also latency. With most new systems employing NUMA based architectures, both memory speed and bandwidth have a major impact. This requirement continues to grow as well as because more and more processor cores are crammed into a single socket allowing for more and more concurrently running processes and threads. In additional NUMA node inter-memory accesses plays a major role. I agree that clock speed alone is not a clear indicator of being "the fastest", since clock speeds haven't changed all that much over the years once getting into the 2-3Ghz+ range, but rather items such as overall processor and memory architectures (eg. Intel QPI), on-board instruction sets, memory latency, memory channels and bandwidth, and also on-chip pipeline L2/L3 cache sizes and speeds all play a role.

What this article is demonstrating is not particularly CPU sizing specifics for any given application, but rather mentioning one of (not the only) useful tools comparing a given processor to another. We all agree there is no substitute for real-world application benchmarking, and what we have found through benchmarking real-world application based on Caché that SPECint (and SPECint_rate) numbers usually provides a safe relative correlation or comparison from processor model to processor model. Now things become more complicated when applications might not be optimally written and impose unwanted bottlenecks such as excessive database block contentions, lock contention, etc... from the application. Those items tend to negatively impact scalability on the higher end and would prohibit linear or predictable scaling.

This article is to serve as the starting point for just one of the components in the "hardware food group". So the real proof or evidence is gained from doing proper benchmarking of your application because that encapsulated all components working together.

Kind regards...

go to post

Mark Bolinsky · Mar 21, 2016

Not just for test/dev/demo either... Caché can support highly resilient enterprise applications in cloud.. I recently posted an article how to use database mirroring in a cloud without the built-in Virtual IP (VIP) to provide rapid failover for high availability and disaster recovery - even between availability zones and/or geo-regions.

https://community.intersystems.com/post/database-mirroring-without-virtual-ip-address

go to post

Mark Bolinsky · Mar 9, 2016

ECP clients are "mirror-aware" meaning when you create remote databases on a given ECP client, they are marked as "mirrored". When the ECP client connects to either mirror member it will be redirected to whichever is the active/primary mirror member. It will also reconnect to a new primary member during failover. Our documentation has good detail about this available here:

http://docs.intersystems.com/cache20152/csp/docbook/DocBook.UI.Page.cls?...

Specifically in the Notes: (1)

ECP application servers do not use the VIP and will connect to any failover member or promoted DR member that becomes primary, so the VIP is used only for users' direct connections to the primary, if any.

go to post

Mark Bolinsky · Mar 9, 2016

Hi Alexey,

Thank you for the post on your deployment. I'm very interested to understand more how a virtual router helped in your deployment. If I'm understanding correctly, because of the use of VMware vSphere and the network rules allowing, the use of the actual VIP within database mirroring was used as normal - meaning Cache' was able to remove/assign the VIP to whichever node was the primary mirror member.

As a side note - with ECP clients in the mix, the VIP is not actually a requirement because ECP clients are "mirror-aware" unless some portion of the application needed to access the database server directly.

I'm curious to learn more how you used the virtual router and what components were NAT/PAT'd to and from. For example, did the vRouter sit between a single external address to an internal load balancer or server pool of ECP clients or web servers?

It's great to hear alternatives to solutions. I look forward to hearing back on your deployment.

Kind regards,
Mark B-

go to post

Mark Bolinsky · Mar 8, 2016

Yes. Database mirroring within cloud infrastructure is possible. As you point out the use of the virtual IP address (VIP) in most cases is not doable. This is due to cloud network management/assignments/rules not particularly liking having IP addresses changing outside of the cloud management facilities.

Having said that, the use of 3rd party load balancers offers a solution in the form of a virtual appliance available in most cloud marketplaces in a Bring-Your-Own-License (BYOL) model. As an example F5 LTM Virtual Edition. With these appliances there are usually two methods available to control network traffic flow.

The first option uses an API called from ^ZMIRROR during failover to instruct the load balancer that a particular server is now the primary mirror member. The API methods range from CLI type scripting to REST API integration.

The second option uses load balancer polling to determine which mirror member is primary. This involves creating a simple CSP page or listening socket to respond whether a given server in the load balanced pool is the primary mirror member.

The second option is more portable and load balancer agnostic since it doesn't rely on specific syntax or integration methods from a given load balancer vendor or model. However the limitation is the frequency of polling. In most cases polling can be as low as a few seconds - which in most scenarios is acceptable.

I will be soon posting a long article here on the Community detailing some examples using F5 LTM VE and providing a sample CSP status page and REST API integration to cover both options mentioned above. I will also be presenting a session during our upcoming Global Summit.

go to post

Mark Bolinsky · Feb 3, 2016

One clarification comment I would like to add is the use of "traditional" HugePages through the process of boot-time reservation is still highly recommended for optimal performance . This process is detailed in the Cache' Installation Guide:

http://docs.intersystems.com/cache20152/csp/docbook/DocBook.UI.Page.cls?KEY=GCI_unixparms#GCI_unixparms_huge_page

go to post

Mark Bolinsky · Feb 3, 2016

I tired using the  tag in a recent post, and no matter where I placed it there was no change. Can you provide a snipet showing a sample in the Filtered HTLM editor?