Can someone tell me if intersystems-ru/deepsee-sysmon-dashboards is developed for a specific version of Ensemble? Looks like it could be useful to my group but we aren't upgrading till later this year and we are on 2015.2.2.
The current version of SAM creates Prometheus metric endpoints which appear to be handled correctly by the current Prometheus scraper; however, the metrics do not conform to the current Prometheus standard. The standard states:
My employer set up a web-based HL7 interface monitor dashboard that displays all Ensemble components (Service/Process/Operation) in a Production, their status, and the support information embedded in each interface's listing on the Monitor. Please see 3 screenshots.
This is part of the URL that we go to when accessing this web-based Monitor: ......57772/csp/healthshare/monitor/Rush.Monitor.Web.Home.cls
My group needs to be able to monitor items / tasks, and let a non-management-portal user see the monitoring. Is it possible to run DeepSee queries on Production items? I feel like I should not be recreating the production environment or the task manager just so that I can query on the items that are running, and on their states (like "successful" or "send email").
Also, I need to log custom events for each task, and I'm running into difficulties with the task manager in this regard; hence the question about using the Production instead, but querying it.
I believe most of you have encountered this problem: a HealthConnect/Ensemble user gets a slow response and asks for a measurement of how long Ensemble takes to process the request, but the Ensemble "activity data" gives no clue about the delay.
The reason is that HealthConnect message measurement is based on the Ensemble message, which cannot tell you when Ensemble received the request or when it sent back the response. When there is a delay in the inbound/outbound adapter or the CSP gateway, there is no way to find that delay from the "activity data".
In the Windows Resource Monitor I can observe multiple parallel processes coming from cache.exe with read operations on journal files.
All except one of these processes show the same read rate (bytes/s). The processes point to different journal files and constantly read between 200 and 3000 bytes/s.
Looking up the corresponding process by PID in the Caché management portal shows the process %SYS.Monitor.Control.1. In 3 days of uptime on the server it has run 181,632,583 commands and modified 32,140,642 globals.
Can someone direct me to where in the documentation we can find how consumption may be calculated for global storage?
Caché version: 2010.1
Operating system: HP OpenVMS 8.4
EDIT: After receiving some responses, it seems I was unclear in my initial inquiry. I am looking to determine our rate of consumption of storage; however, I am having some difficulty in doing that.
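One way to estimate a rate of consumption, independent of any particular Caché utility, is to record the database size at regular intervals and divide the growth by the elapsed time. The sketch below is only an illustration in Python: the function names and the sample figures are hypothetical, and it assumes you already have some way of collecting (date, size) samples.

```python
from datetime import date

def daily_growth_mb(samples):
    """Average growth in MB/day between the first and last sample.

    samples: list of (date, size_mb) tuples sorted by date (hypothetical input).
    """
    (first_day, first_size), (last_day, last_size) = samples[0], samples[-1]
    days = (last_day - first_day).days
    if days <= 0:
        raise ValueError("need samples from at least two different days")
    return (last_size - first_size) / days

def days_until_full(current_mb, capacity_mb, rate_mb_per_day):
    """Days left at the current rate, or None if storage is not growing."""
    if rate_mb_per_day <= 0:
        return None
    return (capacity_mb - current_mb) / rate_mb_per_day

if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    samples = [(date(2024, 1, 1), 1000.0), (date(2024, 1, 11), 1500.0)]
    rate = daily_growth_mb(samples)
    print(rate, days_until_full(1500.0, 2000.0, rate))
```

Using only the first and last sample keeps the arithmetic obvious; with noisy data a fitted trend over all samples would be more robust.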
UPDATE: It turns out it was just me being a dummy, and the snmpd was correctly telling me there is no value associated with that exact key. I should have used snmpwalk instead of snmpget to display the whole tree.
Original Post follows:
Hello! I'm trying to set up SNMP monitoring on Caché, using documentation and this article
Our team is working on building a dashboard for internal reference and monitoring.
We would like to have details such as Interface Name, Current Status, Last Message Processed at, IP & Port, Server/Instance/Production Environment name, etc.
Is there any built-in service or pre-compiled code that we can use to build such a dashboard?
For now we want to keep it basic, but going forward we will enhance it with more advanced features.
We want to monitor an Ensemble Production and send custom email alerts based on some rules. For example, if we normally receive 1 message per second and suddenly we receive 5 or more messages per second, we want to send an email alert. And if tomorrow we no longer want this check, we want to be able to disable it through Ensemble Business Rules.
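The rule described here (alert at 5 or more messages per second, with a switch to turn the check off) can be sketched as follows. This is a language-neutral illustration in Python, not Ensemble Business Rules code; `RateRule` and `should_alert` are hypothetical names, and counting the messages in a time window is assumed to happen elsewhere.

```python
from dataclasses import dataclass

@dataclass
class RateRule:
    """Hypothetical rule: alert when the message rate reaches a limit."""
    name: str
    max_per_second: float
    enabled: bool = True  # set to False to switch the check off

def should_alert(rule, message_count, window_seconds):
    """True if the rule is enabled and the observed rate reaches the limit."""
    if not rule.enabled or window_seconds <= 0:
        return False
    return message_count / window_seconds >= rule.max_per_second

if __name__ == "__main__":
    rule = RateRule("inbound-rate", max_per_second=5.0)
    print(should_alert(rule, 300, 60))  # 5 msg/s over one minute
    rule.enabled = False
    print(should_alert(rule, 300, 60))  # same traffic, rule disabled
```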
I'm looking to set up monitoring for several interfaces. I understand that I can set an Inactivity Timeout. However, obviously there are messages coming through more frequently during certain hours than other hours.
Is there a way to set an Inactivity Timeout for each hour of the day instead of one value that is used all day long?
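To make the idea concrete, a per-hour timeout amounts to a lookup table keyed by hour of day with a fallback value. The Python sketch below only illustrates that logic; it is not an Ensemble setting, and the table contents, the default, and the function names are all assumptions.

```python
from datetime import datetime

# Hypothetical values: 5 minutes during busy hours (08:00-17:59),
# 1 hour for the rest of the day.
DEFAULT_TIMEOUT_S = 3600
HOURLY_TIMEOUT_S = {hour: 300 for hour in range(8, 18)}

def timeout_for(now):
    """Inactivity timeout in seconds that applies at the given time."""
    return HOURLY_TIMEOUT_S.get(now.hour, DEFAULT_TIMEOUT_S)

def is_inactive(last_message_at, now):
    """True if the quiet period exceeds the timeout for the current hour."""
    return (now - last_message_at).total_seconds() > timeout_for(now)

if __name__ == "__main__":
    now = datetime(2024, 1, 1, 9, 10)
    print(is_inactive(datetime(2024, 1, 1, 9, 0), now))
```

The timeout is chosen from the hour of the check, not the hour of the last message, so a gap that spans a quiet night is judged by whichever window is active when the check runs.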
After updating from 2018.2.1 to 2021.1, we observe a change in the behaviour of the Message Bank Enterprise Monitor.
In 2018.2.1, clicking on a specific line in the list of configured systems opened the system dashboard, giving insights about queue counts and error conditions.
I know there's a whole chapter on the subject but I would love a super simple video demo or sample configuration or training course. The myriad menu of options and unfamiliar prompts can make it a bit daunting. The challenge is simple. Send an email notification if the license usage exceeds n% LU consumption. Why? A recent software change seemed to be responsible for causing the LU total consumption to reach 100%.
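The check itself is a simple percentage comparison. The Python sketch below shows only that arithmetic; how the used and total license unit counts are obtained, and how the email is sent, are left out, and the function names are hypothetical.

```python
def license_usage_pct(used, total):
    """License units in use as a percentage of the total."""
    if total <= 0:
        raise ValueError("total license units must be positive")
    return 100.0 * used / total

def exceeds_threshold(used, total, threshold_pct):
    """True if usage is above the configured percentage (the n% above)."""
    return license_usage_pct(used, total) > threshold_pct

if __name__ == "__main__":
    # Made-up numbers: 45 of 50 units in use, alert above 80%.
    print(license_usage_pct(45, 50), exceeds_threshold(45, 50, 80))
```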
I've set up an ODBC connection so I can access Caché data from within SQL Server.
I want to be able to write SQL queries for internal monitoring purposes, similar to what's possible with SQL Server. Specifically I want to be able to check mirroring status (i.e. check which is the current primary mirror member), check the status of any Ensemble productions (started/stopped), check the status of business hosts etc. I want to do all of this from SQL Server to go with our other system monitoring solutions.
Whenever the Windows SNMP Service restarts, the snmpdbg log says the following.
13:08:59 :Attempting initial TCP connection(s) with 1 Cache instances ...
13:08:59 :Get connection with ENSEMBLE on port 1972
13:08:59 :Connection refused on port 1972, check if Cache instance ENSEMBLE is started.
13:08:59 :Cache iscsnmp.dll initialized for 1 configs
Ensemble and all productions are running. I've set up the Caché SNMP agent on many other servers in our company and those are working fine. However, this one server won't budge.
I'm a DBA and support Caché databases on AIX. I coded shell scripts for monitoring journaling status, database sizes, and license end date.
We recently got a new instance of Caché on Windows. I'm just curious to know whether anyone coded database monitoring scripts on Windows using PowerShell or any other scripting language.
I'm trying to configure the Caché Monitor Manager (^MONMGR) utility to send alert e-mails. Following the steps, I have doubts about how to configure the options in "Set Server" to send e-mails through Hotmail or Outlook (smtp-mail.outlook.com). I don't know how to configure the mail server SSLConfiguration for Hotmail or Outlook. Could you help me? Thank you!
I hope you are all doing well. I am currently facing an issue while trying to set up the SNMP subagent functionality for my InterSystems Cache installation.
I am using InterSystems Cache for Windows (AMD64) version 5.2.4 (Build 809_0_9006U). The SNMP subagent functionality requires the iscsnmp.dll dynamic library, which I have been unable to locate in my installation directory.
Since most of our customers moved to Caché 2015.1, some admins have been flooded with CPUPct warnings (sometimes alerts) in the console log without any other signs of insufficient CPU power. Documentation states that:
We are constantly running into issues where there are billions of orphaned messages in our system that cause problems, and we have to manually run a cleanup to fix performance issues.
Currently we are using an older HealthShare instance, but I am not opposed to using IRIS as we will upgrade eventually.
Currently, for monitoring productions, we have a Monitor screen. We have both the Queues page and a DeepSee dashboard which shows the current status of our services. The issues with our current DeepSee method with traffic lights are: 1) the page is a bit slow to load the metrics; 2) a new widget needs to be created for every new service from the team, and although this is easy enough to do, it is time-consuming.
I have Ensemble/HealthShare running in a production environment which is set up with a mirror failover and an arbiter sitting between the members.
In the event of a failover we have a number of connections that need stopping/monitoring and starting in a certain order.
Is there a programmatic way we can detect the failover, stop certain services and operations immediately, and then start them up again in the required order, checking each connection's state before starting the next?
Off the back of the Interface Monitoring post I had created a class that queries the Ens.AlertRequest global and returns the entries between 6pm the night before and 6am in the morning.
I tested this build in our T&D environments and the build worked very well.
However, in our production environment the query is being truncated by what I believe to be a timeout, and I get only partial query output.
In the System > SQL pages my 12-hour query also times out.
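For reference, the 6pm-to-6am window can be computed as below, and one possible workaround for a timeout (an assumption, not something confirmed for this environment) is to run the query over smaller slices of that window and combine the results. The sketch is in Python purely to illustrate the date arithmetic; the function names are hypothetical and the actual query against the alert data is not shown.

```python
from datetime import date, datetime, time, timedelta

def overnight_window(day):
    """Return (start, end): 18:00 on the previous day to 06:00 on `day`."""
    start = datetime.combine(day - timedelta(days=1), time(18, 0))
    end = datetime.combine(day, time(6, 0))
    return start, end

def slices(start, end, hours=1):
    """Yield consecutive (from, to) sub-ranges covering start..end."""
    step = timedelta(hours=hours)
    current = start
    while current < end:
        nxt = min(current + step, end)
        yield current, nxt
        current = nxt

if __name__ == "__main__":
    start, end = overnight_window(date(2024, 3, 2))
    for lo, hi in slices(start, end, hours=3):
        # A shorter query per slice would go here (not shown).
        print(lo, "to", hi)
```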