
Murray Oldfield · Jun 19, 2024 14m read

#Continuous Delivery #System Administration #InterSystems IRIS

I have created some example Ansible modules that add functionality above and beyond simply using the Ansible builtin Command or Shell modules to manage IRIS. You can use these as an inspiration for creating your own modules. My hope is that this will be the start of creating an IRIS community Ansible collection.

I am assuming you are already familiar with Ansible. For example see the community post: Ansible Community Post
This post follows up on my 2024 Global Summit presentation.
The source code and demo scripts are available on GitHub here: GitHub

I expect some editing and changes during the few weeks after the Global Summit (June 2024) so check GitHub.

Ansible modules

To give you an idea of where I am going with this, consider the following: there are around 100 collections containing thousands of Ansible modules. A full list is here: https://docs.ansible.com/ansible/latest/collections/index.html

By design, modules are very granular and do one job well. For example, the built-in module ansible.builtin.file can create or delete a folder. There are multiple parameters for setting owner and permissions, etc., but the module focuses on this one task. The philosophy is that you should not create complex logic in your Ansible playbooks. You want to make your scripts simple to read and maintain.

You can write your own modules, and this post illustrates that. Modules can be written in nearly any language, and they can even be binaries. Ansible is written in Python, so I will use Python to handle the complex logic in the module. However, the logic is hidden within a few lines of YAML that the user interacts with.
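As a rough sketch of that split between a few lines of YAML and the Python behind them, a module file might be laid out like the following. This is not the code from the repository: the parameter names mirror the YAML used later in this post, while the helper name `build_command`, the `choices` list, and the error handling are illustrative assumptions. It also assumes the `iris` command accepts a trailing `quietly` argument, as the playbooks in this post suggest.

```python
"""Sketch of a custom Ansible module layout (illustrative, not the repository code)."""


def build_command(instance_name, action, quietly=False):
    """Return the argument list for `iris <action> <instance> [quietly]`."""
    args = ["iris", action, instance_name]
    if quietly:
        # Assumption: the CLI accepts a trailing "quietly" argument.
        args.append("quietly")
    return args


def main():
    # Imported inside main() so build_command() can be used without Ansible installed.
    from ansible.module_utils.basic import AnsibleModule

    # The argument_spec is what the user fills in from YAML.
    module = AnsibleModule(
        argument_spec=dict(
            instance_name=dict(type="str", required=True),
            action=dict(type="str", required=True, choices=["start", "stop"]),
            quietly=dict(type="bool", default=False),
        ),
        supports_check_mode=False,
    )
    params = module.params
    command = build_command(params["instance_name"], params["action"], params["quietly"])

    # run_command returns the return code, stdout, and stderr of the command.
    rc, out, err = module.run_command(command)
    if rc != 0:
        module.fail_json(msg="iris command failed", rc=rc, stdout=out, stderr=err)
    module.exit_json(changed=True, rc=rc, stdout=out, stderr=err)
```

A real module file ends with an `if __name__ == "__main__": main()` guard so that Ansible can execute it; it is left out here so the helper can be imported and tried on its own.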

How to stop and start IRIS using the Ansible built-in `command` module

You can start or stop IRIS using the built-in command module in an Ansible task or play. The command module runs command line commands with optional parameters on the target hosts. For example:

- name: Start IRIS using built-in command  
  ansible.builtin.command: iris start "{{ iris_instance }}"
  register: iris_output  # Capture the output from the module  

- name: Display IRIS Start Output test 1  
  ansible.builtin.debug:  
    msg: "IRIS Start Output test 1: {{ iris_output.stdout }}"  
  when: iris_output.stdout is defined  # Ensures stdout is displayed only if defined

"{{ iris_instance }}" is a variable. In this case, "iris_instance" is the instance name set sometime earlier. For example, it could be "IRIS" or "PRODUCTION", or anything else. Variables are a way to make your scripts reusable. Using register: iris_output stores the result of the command, including stdout, in the variable "iris_output"; we can display or use the output later.

If successful, the output "msg" is the same as if you had run the command on the command line. For example, it will be like this:

TASK [db_server : Display IRIS Start Output test 1] ***********************************************************
ok: [dbserver1] => {
    "msg": "IRIS Start Output test 1: Using 'iris.cpf' configuration file\n\nStarting Control Process\nAllocated 4129MB shared memory\n2457MB global buffers, 512MB routine buffers\nThis copy of InterSystems IRIS has been licensed for use exclusively by:\nISC TrakCare Development\nCopyright (c) 1986-2024 by InterSystems Corporation\nAny other use is a violation of your license agreement\nStarting IRIS"
}

If there is an error (for example, IRIS is already started or the instance name is wrong), or the return code is non-zero for any other reason, the playbook fails and no further tasks run, which is not ideal. There are ways to manage failure in Ansible scripts, but they quickly get messy and harder to maintain. For example, if IRIS is already started, your playbook comes to a halt with the error message below. Note failed=1 in the PLAY RECAP.

TASK [db_server : Start IRIS using built-in command] **********************************************************
fatal: [dbserver1]: FAILED! => {"changed": true, "cmd": ["iris", "start", "IRIS"], "delta": "0:00:00.014049", "end": "2024-05-09 03:47:05.027348", "msg": "non-zero return code", "rc": 1, "start": "2024-05-09 03:47:05.013299", "stderr": "", "stderr_lines": [], "stdout": "IRIS is already up!", "stdout_lines": ["IRIS is already up!"]}

PLAY RECAP ****************************************************************************************************
dbserver1                  : ok=2    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0
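To illustrate the kind of decision a custom module can make instead of failing, here is a minimal sketch that treats "already up" as a no-change success. The function name `interpret_iris_start` is hypothetical, and matching on the text "already up" is an assumption based only on the single error message shown above; other IRIS versions or failure modes may word things differently.

```python
def interpret_iris_start(rc, stdout):
    """Turn the result of `iris start` into Ansible-style status flags."""
    text = stdout.strip()
    if rc == 0:
        # The instance was started by this run.
        return {"failed": False, "changed": True, "msg": text}
    if "already up" in text.lower():
        # Assumption: this wording means the instance was already running,
        # so nothing changed and the play can safely continue.
        return {"failed": False, "changed": False, "msg": text}
    # Anything else is treated as a genuine failure.
    return {"failed": True, "changed": False, "msg": text or "non-zero return code"}


print(interpret_iris_start(1, "IRIS is already up!"))
```

With logic like this inside a module, the "IRIS is already up!" case reports ok rather than fatal, and later tasks still run.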

Sidebar: If you have used IRIS for a while, you may ask: Why do you not recommend Running IRIS as a systemd service on Unix? I recommend that, but that will get in the way of my current story! But, as you will see, there are reasons to start and stop IRIS manually. Read on :) I have an example playbook and templates for setting up IRIS as a service in the demo examples that accompany this post.

How to use a custom module to start and stop IRIS

The custom Ansible module to start and stop IRIS gracefully handles errors and output. Apart from having cleaner and easier-to-maintain playbooks, here is some background on why you might do this.

Sidebar: Hugepages

Linux systems use pages for memory allocation. The default page size is 4KB. Hugepages are larger memory pages, typically 2MB by default, but can be more. By reducing memory management overhead, hugepages can significantly improve database performance. Hugepages are large, contiguous blocks of physical memory that a process must explicitly use. Using hugepages is good practice for IRIS databases that keep frequently accessed data in memory.

So, hugepages are good for IRIS database servers. The best practice for IRIS is to put IRIS shared memory in hugepages, including global buffers, routine buffers, and GMHEAP. A common rule of thumb for a database server is to use 70-80% of memory for IRIS shared memory in hugepages. Your requirements will vary depending on how much free memory you need for other processes.

Estimating the exact size of IRIS shared memory is complex and can change depending on your IRIS configuration and between IRIS versions.

You want to be as close as possible to the shared memory size IRIS uses when configuring the number of huge pages on your system.

If you allocate too few hugepages for all IRIS shared memory, by default IRIS will start up using standard pages in whatever memory is left, wasting the memory set aside for hugepages!
- This is a common cause of application performance issues. By default, IRIS keeps downsizing buffers until it can start in the available standard (non-hugepage) memory. For example, starting with smaller global buffers increases IO requirements. In the worst case, over time, if not enough memory is available, the system will need to page process memory to the swap file, which will severely impact performance.
If you allocate more hugepages than IRIS shared memory, the remainder is wasted!

The recommended process after installing IRIS or making configuration changes that affect shared memory is:

1. Calculate the memory that is required for IRIS and other processes. If in doubt, start with the common 30% other memory / 70% for shared memory rule mentioned above.
2. Calculate the major structures that will use shared memory from the remainder of memory: global buffers (8K, 64K, etc.), routine buffers, and GMHeap.
   - See this link for more details.
3. Update the settings in your IRIS configuration file.
4. Stop and start IRIS and review the startup output, either at the command line or in messages.log.
5. Set the number of hugepages in the OS kernel based on the actual shared memory used.
6. Finally, restart IRIS and make sure everything is as you expect it to be.
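The arithmetic behind the first and fifth steps can be sketched in a few lines. The function names and the `headroom_percent` parameter are hypothetical. The example output later in this post reports 2106 hugepages of 2MB for 4129MB of shared memory, and a 2% headroom happens to reproduce that figure, but treat the 2% as an assumption rather than the module's documented formula.

```python
def shared_memory_budget_mb(total_memory_mb, shared_percent=70):
    """Rule-of-thumb share of host memory to give to IRIS shared memory."""
    return total_memory_mb * shared_percent // 100


def hugepages_needed(shared_memory_mb, page_size_mb=2, headroom_percent=0):
    """Number of hugepages needed to hold the shared memory, rounded up."""
    numerator = shared_memory_mb * (100 + headroom_percent)
    denominator = 100 * page_size_mb
    # Ceiling division using integers only, so there is no float rounding.
    return -(-numerator // denominator)


print(shared_memory_budget_mb(65536))                # 70% of a 64GB host, in MB
print(hugepages_needed(4129))                        # exact fit, no headroom
print(hugepages_needed(4129, headroom_percent=2))    # with assumed 2% headroom
```

Rounding up matters: if the count is even one page short, IRIS shared memory no longer fits in hugepages and the fallback described above applies.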

An Ansible workflow to right-size hugepages

You may be sizing hugepages and shared memory because you have just installed IRIS, upgraded, or increased or decreased the host memory size for capacity planning reasons. In the example above we saw that if the start command is successful there is some useful information returned, for example, the amount of shared memory:

Allocated 2673MB shared memory
1228MB global buffers, 512MB routine buffers

Ansible playbooks and plays are more like Linux commands than a programming language, although they can be used like a language. In addition to monitoring and handling errors, you could capture the output of the iris start command inline in your playbook and process it to make decisions based on it. But that all gets messy and breaks the DRY principle we should be aiming for when building our automation.

The DRY principle stands for "Don't Repeat Yourself." It is a fundamental concept in software development aimed at reducing the repetition of software patterns, for example, by replacing them with abstractions, which we will now do.

I have created several custom Ansible modules. The Ansible term for a packaged set of modules is a collection, and this is the start of an open-source IRIS collection. The source and examples are here: GitHub.

The IRIS modules collection for these demos is in the /library folder.

Example playbooks

The module iris_start_stop is used like any other Ansible module. The stanzas in the following playbook extract:

Run the custom iris_start_stop module. In this case, stop if already running and restart.
- Register (or store) output in a JSON object in a variable named iris_output.
As a demo display the stdout part as a message.
As a demo display the stderr part as a message.
Display an additional part named memory_info as a message.

- name: Stop IRIS instance test 1  
  iris_start_stop:  
    instance_name: 'IRIS'  
    action: 'stop'  
    quietly: true  
    restart: true  
  register: iris_output  # Capture the output from the stop command  

- name: Display IRIS Stop Output test 1  
  ansible.builtin.debug:  
    msg: "IRIS Stop Output test 1: {{ iris_output.stdout }}"  
  when: iris_output.stdout is defined  # Display stdout from stop command  

- name: Display IRIS Stop Error test 1  
  ansible.builtin.debug:  
    msg: "IRIS Stop Error test 1: {{ iris_output.stderr }}"  
  when: iris_output.stderr is defined  # Display stderr from stop command  

- name: Display IRIS Stop memory facts test 1  
  ansible.builtin.debug:  
    msg: "IRIS Stop memory facts test 1: {{ iris_output.memory_info }}"  
  when: iris_output.memory_info is defined

Example output shows memory_info is returned as a Python dictionary (key: value pairs).

TASK [db_server : Stop IRIS instance test 1] ************************************************************************************************************
changed: [monitor1]

TASK [db_server : Display IRIS Stop Output test 1] ******************************************************************************************************
ok: [monitor1] => {
    "msg": "IRIS Stop Output test 1: Starting Control Process\nAllocated 4129MB shared memory\n2457MB global buffers, 512MB routine buffers\nThis copy of InterSystems IRIS has been licensed for use exclusively by:\nISC TrakCare Development\nCopyright (c) 1986-2024 by InterSystems Corporation\nAny other use is a violation of your license agreement\nStarting IRIS"
}

TASK [db_server : Display IRIS Stop Error test 1] *******************************************************************************************************
ok: [monitor1] => {
    "msg": "IRIS Stop Error test 1: "
}

TASK [db_server : Display IRIS Stop memory facts test 1] ************************************************************************************************
ok: [monitor1] => {
    "msg": "IRIS Stop memory facts test 1: {'shared_memory': 4129, 'global_buffers': 2457, 'routine_buffers': 512, 'hugepages_2MB': 2106}"
}

As you can see, stdout displays the startup message.

However, if you look closely at the memory_info output you can see that the information has been put in a dictionary, which will be useful soon. It also contains the key: value pair 'hugepages_2MB': 2106

Starting and stopping IRIS using an Ansible IRIS module means that a system administrator using Ansible doesn't need to create complex playbooks to handle error checking and calculations or even have a deep understanding of IRIS. The details of how that information was extracted during startup are hidden, as is the calculation of the number of hugepages required for the actual shared memory used by IRIS.
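How the module extracts these numbers is not shown in this post. One plausible approach, sketched here under the assumption that the startup text always uses the wording seen in the output above, is a handful of regular expressions. The function name `parse_memory_info` is hypothetical; the dictionary keys follow the memory_info output shown earlier.

```python
import re


def parse_memory_info(startup_output):
    """Extract memory sizes (in MB) from `iris start` output into a dictionary."""
    # Assumption: these phrases match the startup messages shown in this post.
    patterns = {
        "shared_memory": r"Allocated (\d+)MB shared memory",
        "global_buffers": r"(\d+)MB global buffers",
        "routine_buffers": r"(\d+)MB routine buffers",
    }
    info = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, startup_output)
        if match:
            # Keep only the values actually found in the text.
            info[key] = int(match.group(1))
    return info


sample = (
    "Starting Control Process\n"
    "Allocated 4129MB shared memory\n"
    "2457MB global buffers, 512MB routine buffers\n"
)
print(parse_memory_info(sample))
```

Returning integers rather than strings means the values can go straight into a calculation such as the hugepages count.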

Now that we know the hugepages requirements, we can go on and:

Create a playbook to configure hugepages.

The complete playbook is below.

1. Stop IRIS if it is running, then restart it to capture the shared memory figures.
2. Loop over the memory_info dictionary and create variables from its key: value pairs.
3. Stop IRIS.
4. Set hugepages with sysctl, passing in the hugepages variable. Note: this step lives in its own task file (DRY principles again).
5. Start IRIS.

---  
- name: IRIS hugepages demo  
  ansible.builtin.debug:  
    msg: "IRIS Set hugepages based on IRIS shared memory"  

# Stop iris, in Ansible context "quietly" is required, else the command hangs  
# iris stop has output if there is a restart, use that to display changed status  

- name: Stop IRIS instance and restart  
  iris_start_stop:  
    instance_name: 'IRIS'  
    action: 'stop'  
    quietly: true  
    restart: true  
  register: iris_output  # Capture the output from the stop command  

- name: Set dynamic variables using IRIS start output  
  ansible.builtin.set_fact:  
    "{{ item.key }}": "{{ item.value }}"  
  loop: "{{ iris_output.memory_info | ansible.builtin.dict2items }}"  

# Stop IRIS  

- name: Stop IRIS instance  
  iris_start_stop:  
    instance_name: 'IRIS'  
    action: 'stop'  
    quietly: true  

# Set hugepages  

- name: Set hugepages  
  ansible.builtin.include_tasks: set_hugepages.yml  
  vars:  
    hugepages: "{{ hugepages_2MB }}"  

# Start, quietly. 

- name: Start IRIS instance again  
  iris_start_stop:  
    instance_name: 'IRIS'  
    action: 'start'  
    quietly: true  
  register: iris_output  # Capture the output from the module

The following is the output from the playbook.

TASK [db_server : IRIS hugepages demo] **********************************************************************
ok: [dbserver1] => {
    "msg": "IRIS Set hugepages based on IRIS shared memory"
}

TASK [db_server : Stop IRIS instance and restart] ***********************************************************
changed: [dbserver1]

TASK [db_server : Set dynamic variables using IRIS start output] ********************************************
ok: [dbserver1] => (item={'key': 'iris_start_shared_memory', 'value': 4129})
ok: [dbserver1] => (item={'key': 'iris_start_global_buffers', 'value': 2457})
ok: [dbserver1] => (item={'key': 'iris_start_routine_buffers', 'value': 512})
ok: [dbserver1] => (item={'key': 'iris_start_hugepages_2MB', 'value': 2106})

TASK [db_server : Stop IRIS instance] ***********************************************************************
changed: [dbserver1]

TASK [db_server : Set hugepages] ****************************************************************************
included: /.../roles/db_server/tasks/set_hugepages.yml for dbserver1

TASK [db_server : Set hugepages] ****************************************************************************
ok: [dbserver1] => {
    "msg": "Set hugepages to 2106"
}

:
:
:

TASK [db_server : Start IRIS instance again] ****************************************************************
changed: [dbserver1]

PLAY RECAP **************************************************************************************************
dbserver1                  : ok=13   changed=4    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0

Note that I have edited out the "set hugepages" playbook output. The short story is that the playbook sets hugepages depending on the OS type. If contiguous memory is unavailable and the required number of hugepages cannot be set, the server is rebooted (you might defer this if this process is part of an initial build). The playbook reboots the server and waits for it to be available before continuing. The reboot steps are skipped if memory is available.
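The "was the requested number actually set?" check can be sketched as follows. On Linux, the kernel allocates as many hugepages as it can when the setting is changed, and /proc/meminfo reports the real count in HugePages_Total. The function names are hypothetical, and this is not the logic from set_hugepages.yml, only an illustration of the decision it has to make.

```python
import re


def hugepages_total(meminfo_text):
    """Return HugePages_Total from the text of /proc/meminfo (0 if absent)."""
    match = re.search(r"^HugePages_Total:\s+(\d+)", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else 0


def reboot_needed(meminfo_text, requested_pages):
    """True when the kernel could not allocate all the requested hugepages."""
    return hugepages_total(meminfo_text) < requested_pages


# Illustrative /proc/meminfo extract where only part of the request was met.
sample_meminfo = (
    "MemTotal:        8008820 kB\n"
    "HugePages_Total:    1800\n"
    "HugePages_Free:     1800\n"
    "Hugepagesize:       2048 kB\n"
)
print(reboot_needed(sample_meminfo, 2106))
```

On a live host the text would come from reading /proc/meminfo after the sysctl change; a shortfall usually means memory is too fragmented, which is why a reboot helps.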

Running IRIS as a systemd service

Managing an InterSystems IRIS database instance as a systemd service on systems that use systemd (like most modern Linux distributions: Red Hat, Ubuntu, etc.) offers several advantages regarding consistency, automation, monitoring, and system integration. The main reason I recommend using systemd is that systemd allows you to configure services to start automatically at boot, which is crucial for production environments to ensure that your database is always available unless deliberately stopped. Likewise, it ensures the database shuts down gracefully when the system is rebooting or shutting down.

An example is at this link: iris_start_stop_systemd.yml

You can also manually start and stop IRIS while running as a service.

Running `qlist`

Many of the iris commands could benefit from being made their own modules. I have created iris_qlist.py as another example. The value of using a custom module is that the output is in a dictionary that can easily be turned into variables for use in your Ansible scripts. For example:

- name: Execute IRIS qlist  
  iris_qlist:  
    instance_name: 'IRIS'  
  register: qlist_output  

- name: Debug qlist_output  
  ansible.builtin.debug:  
    var: qlist_output  

- name: Display qlist  
  ansible.builtin.debug:  
    msg: "qlist {{ qlist_output.fields }}"  

- name: Create variables from dictionary  
  ansible.builtin.set_fact:  
    "{{ item.key }}": "{{ item.value }}"  
  loop: "{{ lookup('dict', qlist_output.fields) }}"

The output below shows the variables being populated with IRIS details.

TASK [db_server : IRIS iris_qlist module demo] **************************************************************
ok: [dbserver1] => {
    "msg": "IRIS iris_qlist module demo"
}

TASK [db_server : Execute IRIS qlist] ***********************************************************************
ok: [dbserver1]

:
:
:

TASK [db_server : Create variables from dictionary] *********************************************************
ok: [dbserver1] => (item={'key': 'iris_qlist_instance_name', 'value': 'IRIS'})
ok: [dbserver1] => (item={'key': 'iris_qlist_instance_install_directory', 'value': '/iris'})
ok: [dbserver1] => (item={'key': 'iris_qlist_version_identifier', 'value': '2024.1.0.263.0'})
ok: [dbserver1] => (item={'key': 'iris_qlist_current_status_for_the_instance', 'value': 'running, since Thu May  9 06:50:55 2024'})
ok: [dbserver1] => (item={'key': 'iris_qlist_configuration_file_name_last_used', 'value': 'iris.cpf'})
ok: [dbserver1] => (item={'key': 'iris_qlist_SuperServer_port_number', 'value': '1972'})
ok: [dbserver1] => (item={'key': 'iris_qlist_WebServer_port_number', 'value': '0'})
ok: [dbserver1] => (item={'key': 'iris_qlist_JDBC_Gateway_port_number', 'value': '0'})
ok: [dbserver1] => (item={'key': 'iris_qlist_Instance_status', 'value': 'ok'})
ok: [dbserver1] => (item={'key': 'iris_qlist_Product_name_of_the_instance', 'value': 'IRISHealth'})
ok: [dbserver1] => (item={'key': 'iris_qlist_Mirror_member_type', 'value': ''})
ok: [dbserver1] => (item={'key': 'iris_qlist_Mirror_Status', 'value': ''})
ok: [dbserver1] => (item={'key': 'iris_qlist_Instance_data_directory', 'value': '/iris'})

I will expand the use of Ansible IRIS modules in the future and create more community posts as I progress.

Seisuke Nakahashi · Jul 23, 2024

Thank you so much @Murray Oldfield ! This article will help IRIS developers so much, especially who need to setup or upgrade IRIS instances for many servers at once for testing purpose. Recently such quick instance cycles (install/upgrade/down) have become common, so I hope this information will help many IRIS developers 😆

Your GitHub source (link) and Global Summit session video (link) help me, too!


Ansible modules and IRIS demo