· Jun 15, 2016

Cluster deployment recipes?

Let's assume you have an infinitely scalable algorithm implemented in your application, using replication, ECP, or any other means of horizontal scaling, and let's assume you know how to run your system under any volume of requests. The trick then is to deploy the required number of computing nodes in the cluster. If we are talking about a cluster of 2-4 nodes, your administrator (or, as they call it today, "devops engineer") will install everything manually. He will probably handle a 5-node cluster easily too. But what if you need to deploy 10, 100, or 200 properly configured, interconnected nodes?

Devops are lazy: if there is an easy way to automate a task, they will automate it, even if that means using a new and fancy tool. [One requirement, though: this new and fancy tool should be easy to learn. They could spend a couple of hours reading StackOverflow or vendor forums, but certainly have no time to spend several days. Thus the simpler way usually wins.]

So, it's easy to replicate nodes in the cluster (be it on-premise or in the cloud) if the nodes are identical. A vSphere cmdlet, Docker Hub images, a Digital Ocean droplet, or an AWS template all let you get a new, identical instance up and running in a few moments.

But for ECP I need further fine-tuning at configuration time:

  • Let's assume I've implemented a map-reduce controller, for which we need to instantiate 1 master node and N worker nodes.
  • Each worker node should remotely mount, via ECP, a particular database from the master, where the control code and control global reside.
  • In the reverse direction, the master should remotely mount, via ECP, each worker's database, mapping a particular global subscript from that database (let's call it the "backward data channel").
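
To make the three requirements above concrete, here is a minimal Python sketch of the mount plan for one master and N workers. It only computes which remote mounts each node would need; it does not write cache.cpf syntax or call any Caché API. The host names, the database names `CONTROL` and `WORKERDATA`, and the `^Result` global are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mount:
    node: str                 # node whose configuration gets this remote mount
    server: str               # ECP server the database lives on
    database: str             # database name on that server (hypothetical)
    subscript: Optional[str]  # global subscript mapped from it, if any


def plan_mounts(master: str, workers: List[str]) -> List[Mount]:
    """Return every remote mount needed for 1 master and N workers."""
    mounts = []
    for worker in workers:
        # Each worker mounts the master's control database
        # (control code and control global live there).
        mounts.append(Mount(worker, master, "CONTROL", None))
        # Backward data channel: the master mounts the worker's database
        # and maps one subscript of a (hypothetical) global to it.
        mounts.append(
            Mount(master, worker, "WORKERDATA", f'^Result("{worker}")')
        )
    return mounts


if __name__ == "__main__":
    for mount in plan_mounts("master", [f"worker{i}" for i in range(1, 4)]):
        print(mount)
```

The plan grows as 2 x N entries, which is exactly the part that becomes painful to maintain by hand; a provisioning tool would consume a list like this and turn it into real configuration on each node.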

For 2-3 nodes in the cluster we could make those changes in cache.cpf manually, but at a larger scale we should probably use smarter provisioning tools. Ansible, as described by Murray before, looks like a good candidate for this task, and it can provision the created images very flexibly.

But, being lazy and having no experience with Ansible, I'd prefer to see an easy-to-hack recipe. So, have you already cooked up anything similar that can easily configure the ECP interconnect while instantiating multiple nodes in the cluster? Anything for vSphere, Digital Ocean, Amazon?

Discussion (8)

1) if they are DevOps engineers they don't do things manually. It's an axiom. :-)

2) The issue IMO is not so much which tool one needs to pick (you mention Ansible), but understanding the requirements of the architecture.

3) It's not so much the number of nodes (although that is important), but the fact that the software needs to be much more dynamic and provide clear APIs that one can call. With a static architecture the monolith lives on ;)

4) One very important factor when working in the cloud is networking: you may not know some facts, like IP addresses, in advance. In general, apps use discovery services to find things out. This is the part where we need to grow and provide more "dynamicity". You can, however, adapt and evolve via ZSTART and tell the cluster who you are and what you're after.

5) I was playing with this a few days ago, and because of the lack of an API I had to create one of those horrible scripts with <CR> redirection, blind option selections, etc. ;) - Not the most robust solution for Dev or Ops engineers ;)

HTH in your endeavour Timur

I've already had some experience with such systems in production: one of our projects has a similar architecture, except without Docker, just on Windows, with some physical servers running ECP-Client + CSPGateway + Apache, and one HAProxy server in front of all of these servers. In this case the whole scheme is quite static, and adding a new ECP client means some manual work at all levels. But with Docker I expect to just call something like this command:

docker-compose scale ecp=10

and just get some new working instances, which then immediately get their new web clients.

To run it as microservices and split CSPGateway and the Caché instance into different containers, I need a simple package with only CSPGateway, but the FieldTest versions do not contain one. But yes, sure, I think that is a good way too, and in that case I could have more web containers than Caché ones, if needed.

Let's put aside software architecture (I'll write a number of articles later about what I mean here) and talk about the dirty details.

If you have any concrete details about the way you use Swarm, Ansible, Chef, or similar, then I (and the community) will highly appreciate them.


It would simplify things a lot if we could configure ECP mapping at runtime via some set of API calls, and not statically via editing cache.cpf. Something like how it's done in MongoDB for adding a new shard:


But not for the particular scenario of adding a shard to a shard manager; rather, something more generic for ECP or mapping. I suspect something related is already implemented for EM, but I have no clue how to use it for my case.


And I know there is already an AssignShards call implemented in the forthcoming product, but it's too specific, creating a particular set of mappings. I'd need something more generic.

FWIW this is a huge chapter and we are not even mentioning global mapping or SLM (subscript level mapping), how dynamic you might want to be with it and what kind of elasticity you are after (scaling out and back in). You might find that ECP is not really your friend right here and right now given your cloud-based requirements. ECP is however extremely good for what it was created to do. 


To answer your question, I worked with Terraform, which allows you to create any infrastructure on just about any IaaS cloud. You can then finely customise your nodes by injecting any script you may want. No need for any other configuration management (CM) tool. It has variable interpolation, so it knows to wait for the info it needs until after resources are created, etc.
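
Roughly, the "inject a script" step amounts to filling in a template once the master's address is known. Below is a minimal Python sketch of that idea only; it is not Terraform syntax. The `/opt/app/configure-ecp.sh` helper, its flags, and the variable names are hypothetical placeholders for whatever actually configures ECP on a node.

```python
from string import Template
from typing import Dict

# Boot script injected into every worker. The helper script and its flags
# are hypothetical; substitute whatever really sets up ECP on the node.
BOOT_SCRIPT = Template(
    "#!/bin/sh\n"
    "/opt/app/configure-ecp.sh --role $role "
    "--master $master_ip --node $node_name\n"
)


def render_boot_scripts(master_ip: str, worker_count: int) -> Dict[str, str]:
    """Render one boot script per worker once the master IP is known."""
    scripts = {}
    for index in range(1, worker_count + 1):
        name = f"worker{index}"
        scripts[name] = BOOT_SCRIPT.substitute(
            role="worker", master_ip=master_ip, node_name=name
        )
    return scripts


if __name__ == "__main__":
    # The address below is an example value, as if reported by the cloud
    # provider after the master node was created.
    for name, script in render_boot_scripts("10.0.0.5", 3).items():
        print(f"--- {name} ---")
        print(script)
```

The point of the sketch is the ordering: the worker scripts cannot be rendered until the master exists and its address is known, which is the dependency that an interpolating tool resolves for you.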


Timur, you can look at my example on GitHub, which I wanted to use in an article about using Docker but have not managed to write yet. In this example I have a Dockerfile for an ECP client, which can be built for a particular ECP server. With an %Installer manifest it is possible to make the backward data channel too: since we already know where our server is located, we can connect to it via %Net.RemoteConnect or something else and create a new backward connection. The problem in this case is how to remove the old ones. I have played with it only on one machine. In any case, Ansible could still be useful, but for preparing servers to work in a Docker cluster, which has to be done before we can use docker-compose and so on. My example also contains a web server (Apache), to provide access to the new instance. What I wanted to do next is use some load balancer, HAProxy or traefik (as Luca recommended), to get a single access point to my application that is dynamically expandable, without any manual operations except scaling.

Thanks, Dima, [I did expect you would publish it] and this advice is very interesting and easier for a "lazy devops engineer" to apply. Though some explanations and comments wouldn't hurt. I hope you'll eventually find some time to write the article.


I could not resist adding a few notes about your Dockerfile:

- from a pure microservices point of view, for the generic case of multiple ECP clients, it IMHO makes little sense to install the CSP gateway in each instantiated Docker instance;

- I'd run it on the master (ECP database server) instance, or probably as a separate Docker image;

- [though I suspect that for the HAProxy scenario you might need to have these CSP gateway services spread over each instance just for high availability. I'd be curious what Luca would recommend here from a microservices perspective?]


Is there something unique in the ECP server? Can you create one OS image with Caché installed, the ECP service started, the correct mapping, the namespace, etc. as a template?

Will the application server connect automatically to the named Data server on Caché startup?

So on Application server: