Question
· Mar 19, 2021

SAM - Hacks and Tips for set up and adding metrics from non-IRIS targets

SAM (System Alerting and Monitoring) comes as a 'batteries included' docker-compose container set that is ready to start monitoring IRIS instances with a default dashboard as soon as it starts up. The initial configuration is good for understanding SAM functionality and starting basic monitoring of your IRIS systems. However, out of the box, there are some settings that you will need to change when you start to monitor many systems and collect a lot of metric data. To get more value out of SAM, you will also want to add metrics from additional data sources (targets). The following tips will assist you on the path to deploying SAM in a production environment, collecting metrics from multiple targets, and combining them in your own dashboards and charts. You will also see some commands that might be useful as you explore the SAM containers and applications.

Caveat emptor: I should point out that some of these hacks and tips are probably not best practice; this is more a log of how I configured SAM the first time to monitor a benchmark with multiple servers and non-IRIS targets on the same systems. If you have suggestions, please educate me in the comments ;) So, remembering that this post may change over time, let's start.


The tips below involve restarts of docker and starts and stops of SAM. Read through the tips, decide which ones apply to you, then do them in the order shown.

1. Ensure you have enough space for SAM databases

By default docker containers store files in root (/) filesystem. SAM does not need much CPU or memory resources; however, metrics collection will take up space.

How much storage is required for metrics 'depends'. Although it may not be clear how big your databases will be before you start monitoring, as a back-of-the-envelope guide: monitoring 10 VMs for IRIS and operating system metrics on a 15-second scrape cycle consumed approximately 50GB of storage.

Strategies include increasing the root storage of your monitoring instance, or changing the volume location for the databases. I used the following commands to change the docker directory on my VM running docker. Perhaps it is cracking a nut with a sledgehammer, but it works.

  • Stop docker and copy the docker files to a filesystem with plenty of space (in this case /data/docker/data). See the following example:
[root@mysamserver lib]# sudo systemctl stop docker  
[root@mysamserver lib]# pwd  
/var/lib  
[root@mysamserver lib]# cp -rp docker /data/docker/data  
[root@mysamserver lib]#  
[root@mysamserver lib]# rm -rf docker
  • Update the volume path in the docker configuration file. Note this file also has a network setting 'bip' (see note:...)
cat /etc/docker/daemon.json

    {
            "data-root": "/data/docker/data",
            "bip": "192.168.0.1/24"
    }
  • Restart docker
sudo systemctl daemon-reload
sudo systemctl restart docker

systemctl status docker.service
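
After the restart, it can be worth confirming that docker really is using the new location. The following is a minimal sketch and not part of the original steps: configured_data_root is a hypothetical helper name, and it assumes daemon.json keeps "data-root" and its value on a single line, as in the example above.

```shell
# Hypothetical helper: print the "data-root" value from a daemon.json file.
# Assumes the key and its value sit on one line, as in the example above.
configured_data_root() {
    sed -n 's/.*"data-root"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Usage sketch (run on the docker host; the docker command needs a running daemon):
#   configured_data_root /etc/docker/daemon.json   # path set in the config file
#   docker info --format '{{ .DockerRootDir }}'    # path the daemon is actually using
#   df -h /data/docker/data                        # free space on the new filesystem
```

If the two paths differ, the daemon has probably not picked up the edited file.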

2. Set up SAM

I assume you have set up SAM on a test system and are familiar with its basic operations: adding clusters and instances, and viewing system metrics. I suggest you take 20 minutes to view my Virtual Global Summit 2020 presentation for an overview of the install steps, and also how SAM looks in operation once metrics from multiple targets have been added. To view the session use the following link (you will need to register with your email):

DEV007 System Alerting & Monitoring

Log on to the SAM portal and configure some IRIS instances. This populates the configuration files and gives you a guide.

http://mysamserver:8080/api/sam/app/index.csp#/

Note: If you have many instances to add, or you wish to script this step, it is possible to add instances via an API. See the documentation.

3. Upgrade to a production licence

Out of the box SAM ships with an IRIS Community Edition licence. There are several limitations, including a 10GB limit on the size of an IRIS.DAT. 10GB is not big enough to collect data from many targets over a long period of time. Follow up with your InterSystems contacts for a production licence. Updating the licence in a stripped-down container without an editor can be tricky. I simply logged into an interactive session on the IRIS container and updated the licence key using the following commands;

  • open a shell and change directory to the mgr folder (the default location of the iris.key file)
docker exec -it sam_iris_1 bash
cd /dur/iconfig/mgr
  • Update the key with a unix 'here document'. The shell shows a '>' continuation prompt; paste the key text after it. After the key text, end the here document by typing EOF on a line of its own at the '>' prompt. Then 'exit' the shell.
cat <<EOF >iris.key
>
[ConfigFile]
FileType=InterSystems License Rev-A.1
LicenseID=999999

[License]
LicenseCapacity=InterSystems IRIS 2020.2 Server for SAM:etc etc, the key you were sent by your InterSystems contact.
>EOF
exit
  • Then stop and start SAM using the supplied docker-compose shell scripts;
./stop.sh
./start.sh
  • You can check everything is OK with the licence by accessing the System management portal, or by logging into the iris instance and checking the messages.log.
docker exec -it sam_iris_1 bash
cd /dur/iconfig/mgr
cat messages.log
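
To avoid paging through the whole log, you could filter it for licence-related lines. This is a sketch only: licence_lines is a hypothetical helper, and I am assuming the relevant messages contain the word 'license' or 'licence'; check the wording in your own messages.log.

```shell
# Hypothetical helper: keep only lines mentioning license/licence (any case).
# Reads log text on stdin. The search term is an assumption about the log wording.
licence_lines() {
    grep -i -E 'licen[cs]e'
}

# Usage sketch, using the container name and path from the steps above:
#   docker exec sam_iris_1 cat /dur/iconfig/mgr/messages.log | licence_lines
```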

4. Install additional prometheus exporters on targets

For example, the Prometheus Node Exporter exposes a wide variety of hardware- and kernel-related metrics.

Node exporter documentation

Test that node exporter is working by requesting (scraping) the instance endpoint for metrics;

curl my_target_server_name:9100/metrics

You should see something like:

mylaptop:~ mo$ curl my_target_server_name:9100/metrics | more
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 4.8862e-05
go_gc_duration_seconds{quantile="0.25"} 7.5898e-05
go_gc_duration_seconds{quantile="0.5"} 9.2974e-05
go_gc_duration_seconds{quantile="0.75"} 0.000130664
go_gc_duration_seconds{quantile="1"} 0.000358762
go_gc_duration_seconds_sum 303.291715258
go_gc_duration_seconds_count 2.572586e+06
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 9
:
: many many metrics will be displayed

Note you can do the same with your IRIS instances:

mylaptop:~ mo$ curl my_target_server_name:52776/api/monitor/metrics | more
iris_cpu_pct{id="AUXWD"} 0
iris_cpu_pct{id="CSPDMN"} 0
iris_cpu_pct{id="CSPSRV"} 0
iris_cpu_pct{id="ECPCliR"} 0
iris_cpu_pct{id="ECPCliW"} 0
iris_cpu_pct{id="ECPSrvR"} 0
iris_cpu_pct{id="ECPSrvW"} 0
:
: many many metrics will be displayed
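
The raw output is long, so a quick way to see what a target exposes is to list just the metric names. A minimal sketch, assuming the output is in the Prometheus text format shown above; metric_names is a hypothetical helper name.

```shell
# Hypothetical helper: list the unique metric names in scraped output.
# Reads metrics text on stdin, skips comment (#) and blank lines,
# then strips the label set and the value from each sample line.
metric_names() {
    awk '!/^#/ && NF { sub(/[{ ].*/, ""); print }' | sort -u
}

# Usage sketch, with the host names and ports from the examples above:
#   curl -s my_target_server_name:9100/metrics | metric_names
#   curl -s my_target_server_name:52776/api/monitor/metrics | metric_names
```

The resulting names are the ones you will later type into Prometheus queries in Grafana.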

5. Edit configuration files to add scrape of new targets

For example, add the node-exporter instance from the previous tip. The configuration files are in the directory where you installed SAM. As shown below, there are grafana and prometheus yml configuration files.

[root@mysamserver sam-1.0.0.115-unix]# ls
config  docker-compose.yml  readme.txt  start.sh  stop.sh
[root@mysamserver sam-1.0.0.115-unix]# tree -x config
config
├── alertmanager
│   └── isc_alertmanager.yml
├── grafana
│   ├── dashboard.json
│   ├── dashboard-provider.yml
│   ├── datasource.yml
│   └── grafana.ini
├── nginx
│   └── nginx.conf
└── prometheus
    ├── isc_alert_rules.yml
    └── isc_prometheus.yml

4 directories, 8 files

5.1 Add targets to prometheus

The following example is an isc_prometheus.yml file created using the configuration GUI screens in SAM. The file shows two clusters. One cluster is monitoring the SAM instance itself; the other is monitoring five IRIS instances.

alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - alertmanager:9093
global:
  evaluation_interval: 15s
  scrape_interval: 15s
remote_read:
- url: http://iris:52773/api/sam/private/db/read
remote_write:
- url: http://iris:52773/api/sam/private/db/write
rule_files:
- ./isc_alert_rules.yml
scrape_configs:
- job_name: SAM
  metrics_path: /api/monitor/metrics
  scheme: http
  static_configs:
  - labels:
      cluster: "1"
    targets:
    - mysaminstance.mycompany.com:8080
  - labels:
      cluster: "2"
    targets:
    - myiristarget1:52776
    - myiristarget2:52776
    - myiristarget3:52776
    - myiristarget4:52776
    - myiristarget5:52776
  • To add scraping of additional targets running node-exporter, the node_shard jobs at the bottom of the following isc_prometheus.yml file are added. Note that the metrics_path (/metrics) is different from the IRIS one (/api/monitor/metrics).
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - alertmanager:9093
global:
  evaluation_interval: 15s
  scrape_interval: 15s
remote_read:
- url: http://iris:52773/api/sam/private/db/read
remote_write:
- url: http://iris:52773/api/sam/private/db/write
rule_files:
- ./isc_alert_rules.yml
scrape_configs:
- job_name: SAM
  metrics_path: /api/monitor/metrics
  scheme: http
  static_configs:
  - labels:
      cluster: "1"
    targets:
    - mysaminstance.mycompany.com:8080
  - labels:
      cluster: "2"
    targets:
    - myiristarget1:52776
    - myiristarget2:52776
    - myiristarget3:52776
    - myiristarget4:52776
    - myiristarget5:52776
- job_name: node_shard1
  metrics_path: /metrics
  scheme: http
  static_configs:
  - labels:
      cluster: "2"
      group: node
    targets:
    - myiristarget1:9100
- job_name: node_shard2
  metrics_path: /metrics
  scheme: http
  static_configs:
  - labels:
      cluster: "2"
      group: node
    targets:
    - myiristarget2:9100
- job_name: node_shard3
  metrics_path: /metrics
  scheme: http
  static_configs:
  - labels:
      cluster: "2"
      group: node
    targets:
    - myiristarget3:9100
- job_name: node_shard4
  metrics_path: /metrics
  scheme: http
  static_configs:
  - labels:
      cluster: "2"
      group: node
    targets:
    - myiristarget4:9100
- job_name: node_shard5
  metrics_path: /metrics
  scheme: http
  static_configs:
  - labels:
      cluster: "2"
      group: node
    targets:
    - myiristarget5:9100
  • Then stop and start SAM using the supplied docker-compose shell scripts;
./stop.sh
./start.sh

SAM is now collecting metrics from the IRIS instances you added through the GUI and from node-exporter on the same instances.
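
The five node_shard jobs differ only in their number and host name, so for a longer list of targets you might generate the stanzas rather than type them. The following is a sketch under assumptions: node_jobs is a hypothetical helper, it copies the layout of the jobs above, and it assumes node-exporter listens on port 9100 on every host.

```shell
# Hypothetical helper: print one node-exporter scrape job per host.
# Usage: node_jobs CLUSTER_ID HOST [HOST...]
node_jobs() {
    cluster=$1
    shift
    n=1
    for host in "$@"; do
        # The here document body must start in column 0 to keep the YAML indentation.
        cat <<EOF
- job_name: node_shard${n}
  metrics_path: /metrics
  scheme: http
  static_configs:
  - labels:
      cluster: "${cluster}"
      group: node
    targets:
    - ${host}:9100
EOF
        n=$((n + 1))
    done
}

# Usage sketch: review the output, then paste it under scrape_configs.
#   node_jobs 2 myiristarget1 myiristarget2 myiristarget3
```

Printing to the terminal first, rather than appending straight to isc_prometheus.yml, gives you a chance to check the indentation before SAM is restarted.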

6. Increase the number of days prometheus collects metrics

In the first release of SAM you can change the number of days to collect metrics in the GUI. However, I had an issue displaying all the metrics. Until I figure out what is happening, I changed the retention days in Prometheus as well; otherwise SAM will collect data, but you will not see the metrics in Prometheus queries in Grafana.

In the docker-compose.yml file, at the level where you installed SAM, change the retention days, which are set when prometheus starts. For example, in the prometheus stanza update the storage.tsdb.retention.time parameter to match the retention days you want:

prometheus:
    command:
      - --web.enable-lifecycle
      - --config.file=/config/isc_prometheus.yml
      - --storage.tsdb.retention.time=30d

Note: In version 1 of SAM the maximum retention period is 30 days.
Then stop and start SAM using the supplied docker-compose shell scripts;

./stop.sh
./start.sh
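
A quick check that the flag is actually present in the file can save a wasted stop/start cycle. A minimal sketch; retention_setting is a hypothetical helper and assumes the flag appears on one line in the form shown above.

```shell
# Hypothetical helper: print the value of --storage.tsdb.retention.time
# found in a docker-compose file (for example "30d").
retention_setting() {
    sed -n 's/.*--storage\.tsdb\.retention\.time=\([^[:space:]"]*\).*/\1/p' "$1"
}

# Usage sketch, run in the directory where SAM is installed:
#   retention_setting docker-compose.yml
```

An empty result means the flag is missing or spelled differently from the example.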

7. Create your own dashboards

You can add panels to existing dashboards, or create new dashboards to suit your monitoring needs. This is a big subject, so I will leave that for the next post. However, here are a few pointers to help you on your way.

To switch to Grafana, use the 'View in Grafana' button on the SAM screen.

Once in Grafana you can create or edit dashboards;

There are many examples on the web for querying exporters such as node-exporter.

The real power comes when you can display together your IRIS system metrics (the defaults in SAM), your IRIS application metrics (you need to build these into your applications), and other metrics such as those from node-exporter or any number of other exporters created by vendors. For an example, see: Monitor Docker containers using SAM and cAdvisor

Product version: IRIS 2020.4

I am on Linux

Linux version 3.10.0-1127.19.1.el7.x86_64 (mockbuild@x86-vm-28.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC) ) #1 SMP Tue Aug 11 19:12:04 EDT 2020
 

 ls -al /etc/docker
total 20
drwxr-xr-x.   3 root root   64 Nov 11 06:33 .
drwxr-xr-x. 112 root root 8192 Nov 30 12:46 ..
drwxr-xr-x.   5 root root   79 Nov 11 06:33 certs.d
-rwxrwxrwx    1 root root    4 Oct 14 15:22 daemon.json.rpmsave
-rw-------    1 root root  244 Sep 22 09:53 key.json

Second question

I get this error when trying to add SAM instance: 

Thanks

FYI

CONTAINER ID   IMAGE                              COMMAND                  CREATED        STATUS                          PORTS                                                  NAMES
79e219366d65   nginx:1.17.9-alpine                "nginx -g 'daemon of…"   41 hours ago   Up 41 hours                     80/tcp, 0.0.0.0:8080->8080/tcp                         sam_nginx_1
7a457c8cbf61   grafana/grafana:6.7.1              "/run.sh"                41 hours ago   Up 41 hours                     3000/tcp                                               sam_grafana_1
871d1b2766d0   prom/alertmanager:v0.20.0          "/bin/alertmanager -…"   41 hours ago   Up 41 hours                     9093/tcp                                               sam_alertmanager_1
c00e55973c27   prom/prometheus:v2.17.1            "/bin/prometheus --w…"   41 hours ago   Restarting (1) 56 seconds ago                                                          sam_prometheus_1
c3f0255fc1a8   store/intersystems/sam:1.0.0.115   "/iris-main"             41 hours ago   Up 41 hours (healthy)           2188/tcp, 51773/tcp, 52773/tcp, 53773/tcp, 54773/tcp   sam_iris_1