https://www.youtube.com/embed/EYLHIEAxHLk
System Alerting and Monitoring (SAM) in InterSystems IRIS data platform helps you efficiently monitor and manage your systems. This video shows some solutions it offers for specific challenges faced by developers and operators:
InterSystems IRIS Business Intelligence provides the Cube Registry as an interface for managing and scheduling build and synchronize tasks for your cubes.
Previously, I shared with you all a handy operational analytics dashboard you can build to visualize key message processing metrics, such as number of inbound/outbound messages, average processing times, etc.
Whenever the Windows SNMP Service restarts, the snmpdbg log says the following.
13:08:59 :Attempting initial TCP connection(s) with 1 Cache instances ...
13:08:59 :Get connection with ENSEMBLE on port 1972
13:08:59 :Connection refused on port 1972, check if Cache instance ENSEMBLE is started.
13:08:59 :Cache iscsnmp.dll initialized for 1 configs
Ensemble and all productions are running. I've set up the Caché SNMP agent on many other servers in our company and those are working fine. However, this one server won't budge.
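One quick check for the "Connection refused" entry in the log is to test, from the same server, whether anything is accepting TCP connections on that port. The following is a minimal Python sketch, not part of the original post: the helper name `port_is_open` and the host `127.0.0.1` are assumptions for illustration, and 1972 is the port taken from the log.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (for example ConnectionRefusedError or a timeout) when it fails.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 1972 is the port shown in the snmpdbg log; the host is an assumption.
    print(port_is_open("127.0.0.1", 1972))
```

If this prints False while the instance is running, the instance may be listening on a different port than the one the SNMP agent is trying. If it prints True, the port itself is reachable and the cause likely lies elsewhere.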
I created a task in the Management Portal Task Manager that uses the Ens.Util.Tasks.Purge task. The task setup includes email notification for both the completion email and the error email.
This task is giving an error and no email is generated:
As part of our continuous efforts to expand and improve the InterSystems IRIS Data Platform, we’ve set up a brief survey around SQL monitoring. Your feedback will help us in designing and developing the right tools for the job and improve the platform’s overall ease-of-use. Please use the link below to access the survey, which should only take around 5 minutes to complete.
I just watched the recording of Michael Brady's presentation on Ensemble Disk Free Space Monitoring. Is the sample code for the Task definition class still available? How can I obtain a copy?
https://www.youtube.com/embed/EYZ4dXNZNSY
https://www.youtube.com/embed/9yEm7ZAZENI
https://www.youtube.com/embed/xcHjcBTLw8o
https://www.youtube.com/embed/7ImJPCdp96A
GA releases are now available for the first version (v1.0) of InterSystems System Alerting and Monitoring (InterSystems SAM for short).
InterSystems SAM v1.0 provides a modern monitoring solution for InterSystems IRIS-based products. It offers high-level views of clusters and drill-down metrics visualization for single nodes, together with alert notifications. This first version provides visualization for more than one hundred InterSystems IRIS kernel metrics, and users can extend the default-supplied Grafana template to their liking.
With this article, I would like to show you how easily and dynamically System Alerting and Monitoring (SAM for short) can be configured. One use case is a fast, agile CI/CD provisioning pipeline in which you run unit tests as well as stress tests and want to see quickly whether those tests succeed and how they stress your systems and your application (the InterSystems IRIS backend SAM API is extensible for your APM implementation).
Preview releases are now available for the first version (v1.0) of InterSystems System Alerting and Monitoring (InterSystems SAM for short).
InterSystems SAM v1.0 provides a modern monitoring solution for InterSystems IRIS-based products. It offers high-level views of clusters and drill-down metrics visualization for single nodes, together with alert notifications. This first version provides visualization for more than one hundred InterSystems IRIS kernel metrics, and users can extend the default-supplied Grafana template to their liking.
V1.0 is meant to be a simple and intuitive baseline. Please help us make it great by trying it and sending us feedback!
SAM can display information from InterSystems IRIS-based instances, starting with version 2019.4.
SAM is only available in container format. You will need the SAM Manager container plus a small set of additional open-source components (Prometheus and Grafana) that are added automatically by the composition file.
SAM components and the SAM Manager Community Edition are available from
Externally at the SAM components GitHub repo and the SAM Manager on Docker Hub, if you want to download it before docker-compose runs (this last link might not be available for a few hours, but the container is pullable)
If you are traveling or prefer a voice-based Q&A description of what SAM is, here is a podcast we have prepared for you:
https://5e18edf067eb59-03854285.castos.com/player/198587
One of the topics that comes up often when managing Ensemble productions is disk space:
The database (the CACHE.DAT file) grows at an unexpected rate; or the journal files build up at a fast pace; or the database grows continuously even though the system has a scheduled purge of the Ensemble runtime data.
It would be better if this kind of phenomenon were observed and accounted for at the development and testing stage rather than on a live system.
For this purpose I created a basic framework that could aid in this task.
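The framework itself is not shown in this excerpt. As a rough illustration of the kind of measurement involved, the Python sketch below estimates an average growth rate from periodic size samples of a file such as CACHE.DAT. It is not the author's framework: the function name, the sample format, and the numbers are all hypothetical.

```python
from datetime import datetime


def growth_mb_per_day(samples):
    """Average growth in MB/day between the first and last sample.

    samples: list of (timestamp, size_in_mb) tuples, oldest first.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, s0), (t1, s1) = samples[0], samples[-1]
    days = (t1 - t0).total_seconds() / 86400
    if days <= 0:
        raise ValueError("samples must span a positive time interval")
    return (s1 - s0) / days


# Hypothetical sizes recorded at midnight on three consecutive days.
samples = [
    (datetime(2020, 1, 1), 10_000),
    (datetime(2020, 1, 2), 10_400),
    (datetime(2020, 1, 3), 11_000),
]
print(growth_mb_per_day(samples))  # 500.0
```

Collecting such samples during development and testing gives a baseline against which growth on the live system can be compared.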
A long time ago I enabled Activity Monitoring to save myself headaches in the future when looking at the performance of various message routes through our productions. It has served its purpose of answering questions such as how many messages we process a week, but I had not had the chance to really dig into the stats for specific message types or destinations to pinpoint issues.
https://www.youtube.com/embed/XtXzvN3Gqgw
https://www.youtube.com/embed/4k9Qsc_HW7g
Off the back of the Interface Monitoring post I had created a class that queries the Ens.AlertRequest global and returns the entries between 6pm the night before and 6am in the morning.
I tested this build in our T&D environments and the build worked very well.
However, in our production environment the query is being truncated by what I believe to be a timeout, and I get only partial query output.
In the System > SQL pages, my 12-hour query times out.
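If the timeout is tied to how long the single 12-hour query runs, one option is to query the window in smaller slices and combine the results. The Python sketch below only computes the window boundaries (6pm the previous evening to 6am today) and splits them into slices. The function names are hypothetical, and how each slice is passed to the query against Ens.AlertRequest is left to the existing class.

```python
from datetime import date, datetime, time, timedelta


def overnight_window(today):
    """Return (start, end): 18:00 the previous day to 06:00 on `today`."""
    end = datetime.combine(today, time(6, 0))
    start = datetime.combine(today - timedelta(days=1), time(18, 0))
    return start, end


def split_window(start, end, hours=2):
    """Split [start, end) into consecutive slices of at most `hours` hours."""
    step = timedelta(hours=hours)
    slices = []
    cursor = start
    while cursor < end:
        nxt = min(cursor + step, end)
        slices.append((cursor, nxt))
        cursor = nxt
    return slices


# Example date chosen for illustration only.
start, end = overnight_window(date(2020, 3, 10))
for lo, hi in split_window(start, end, hours=4):
    print(lo, "->", hi)
```

With 4-hour slices the 12-hour window becomes three shorter queries, each of which may be more likely to finish within the time limit.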
APM normally focuses on the activity of the application, but gathering information about system usage gives you important background that helps you understand and manage the performance of your application, so I am including the IRIS History Monitor in this series.
In this article I will briefly describe how you start the IRIS or Caché History Monitor to build a record of the system level activity to go with the application activity and performance information you gather. I will also give examples of SQL to access the information.
We are constantly running into issues where billions of orphaned messages in our system cause problems, and we have to run a manual cleanup to fix the performance issues.
This post is dedicated to the task of monitoring a Caché instance using SNMP. Some users of Caché are probably doing it already in one way or another. Monitoring via SNMP has been supported by the standard Caché package for a long time now, but not all the necessary parameters are available “out of the box”. For example, it would be nice to monitor the number of CSP sessions, get detailed information about license use, track particular KPIs of the system being used, and so on. After reading this article, you will know how to add your own parameters to Caché monitoring using SNMP.