#Monitoring

5 Followers · 179 Posts

Monitoring is the process of controlling and managing the performance and availability of software applications.

Question Erin Dolson · Jul 16, 2019

Hi all,

I'm looking to set up monitoring for several interfaces. I understand that I can set an Inactivity Timeout. However, messages come through more frequently during certain hours than others.

Is there a way to set an Inactivity Timeout for each hour of the day instead of one value that is used all day long? 

Best,

Erin
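One way to picture the idea in the question above is a lookup of the timeout by hour of day. This is only a minimal sketch in Python, not an Ensemble feature: the names `HOURLY_TIMEOUTS`, `DEFAULT_TIMEOUT`, `timeout_for`, and `is_inactive`, and all the values, are hypothetical.

```python
from datetime import datetime

# Hypothetical schedule: inactivity timeout in seconds per hour of the day.
# Hours that are not listed fall back to DEFAULT_TIMEOUT.
DEFAULT_TIMEOUT = 3600
HOURLY_TIMEOUTS = {hour: 300 for hour in range(8, 18)}  # busy hours 08:00-17:59


def timeout_for(moment: datetime) -> int:
    """Return the inactivity timeout (seconds) that applies at the given time."""
    return HOURLY_TIMEOUTS.get(moment.hour, DEFAULT_TIMEOUT)


def is_inactive(last_message: datetime, now: datetime) -> bool:
    """True when the gap since the last message exceeds the timeout for the current hour."""
    return (now - last_message).total_seconds() > timeout_for(now)
```

With this schedule, a 10-minute gap at 09:10 counts as inactivity, while a 30-minute gap at 01:30 does not.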

12
3 926
Question Glenn van Bavel · May 4, 2017

Whenever the Windows SNMP Service restarts, the snmpdbg log says the following. 

13:08:59 :Attempting initial TCP connection(s) with 1 Cache instances ...
13:08:59 :Get connection with ENSEMBLE on port 1972
13:08:59 :Connection refused on port 1972, check if Cache instance ENSEMBLE is started.
13:08:59 :Cache iscsnmp.dll initialized for 1 configs

Ensemble and all productions are running. I've set up Caché SNMP agent on many other servers in our company and those are working fine. However this one server won't budge. 

Does anyone have any idea what the problem may be here? 

Regards,

Glenn

10
0 1383
Question Michael Jobe · Jan 26, 2021

The current version of SAM creates Prometheus metric endpoints which appear to be handled correctly by the current Prometheus scraper. However, the metrics do not conform to the current Prometheus standard. The standard states:

  • Prometheus' text-based format is line oriented. Lines are separated by a line feed character (\n). The last line must end with a line feed character. Empty lines are ignored.

Link is here: Prometheus format

In the current output on the

<hostname>:8080/api/monitor/metrics

endpoint, the last metric does not end with a newline, causing parsers to fail.
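The rule quoted above (the last line must end with a line feed) can be checked on a scraped payload before it is handed to a strict parser. The following Python sketch is only an illustration: the function names and the sample metric lines are hypothetical, not actual SAM output.

```python
def ends_with_line_feed(payload: str) -> bool:
    """Check the rule that the last line of the text format ends with a line feed."""
    return payload.endswith("\n")


def normalize_exposition(payload: str) -> str:
    """Append the missing final line feed; compliant payloads are returned unchanged."""
    return payload if payload.endswith("\n") else payload + "\n"


# Hypothetical payload whose last metric line lacks the trailing "\n".
sample = "example_metric_a 1\nexample_metric_b 2"
fixed = normalize_exposition(sample)
```

A workaround like this only patches the symptom on the consumer side; the endpoint itself would still need to emit the final line feed to comply.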

9
0 515
Question Oliver Wilms · Aug 13, 2019

Hello,

I want to create a dashboard with a line graph that shows system availability over time. I used this code to create a Dashboard:

    Set tItem = ##class(%DeepSee.UserLibrary.Link).%New()
    Set tItem.fullName = "Availability"
    Set tPage = "Availability.UI.CSVImport.zen"
    Set tItem.href = $system.CSP.GetPortalApp($namespace,tPage)_tPage
    Set tItem.title = "Availability"
    Set tSC = tItem.%Save()
7
0 435
Question Mary George · Sep 4, 2020

Hi, 

I created a task from the Management Portal Task Manager to use the Ens.Util.Tasks.Purge task. The task setup includes email notification for both the completion email and the error email.

This task is giving an error and no email is generated:

<CLASS DOES NOT EXIST>zSendMail+22^%SYS.TaskSuper.1 *Security.SSLConfigs

I tested all other task types available from Ens.Util.task but all are giving the same error.

Not sure if this is a bug or some missing configuration in the task setup. Has anyone noticed a similar issue, or does anyone have an idea how to fix this?


Thank you for your help.

Regards,

Mary

7
0 780
Question Scott Roth · Oct 2, 2019

We are constantly running into issues where there are billions of Orphaned messages in our system that cause problems, and we have to manually run a cleanup to fix performance issues.

The following article about orphaned messages, https://community.intersystems.com/post/ensemble-orphaned-messages, mentions either programmatically eliminating the orphaned messages or using a utility like Demo.Util.CleanupSet in ENSDEMO.

7
2 1016
Question Tirthankar Bachhar · Nov 4, 2016

As per the documentation of QueueCountAlert:
Number of messages on this item's queue needed to trigger an Alert message to be sent. Note that no further alerts will be sent unless the number of messages on the queue drops below 80% of this number and then rises again to this number.
Note that this alert will be sent even if AlertOnError is False.
Zero means no alerts of this type will be sent.
Now, the question is:
If QueueCountAlert is set to 10 and the queue size becomes 11, we will get an email once.
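The re-arming behaviour described in the quoted documentation can be simulated to see when emails would be expected. This Python sketch is just one reading of that text, not the Ensemble implementation; the function name `alerts_for` and the sample queue counts are hypothetical.

```python
def alerts_for(queue_counts, threshold):
    """Return the indexes of the samples at which an alert would be sent."""
    if threshold == 0:  # zero means no alerts of this type are sent
        return []
    alerts = []
    armed = True  # an alert may fire only while armed
    for index, count in enumerate(queue_counts):
        if armed and count >= threshold:
            alerts.append(index)
            armed = False  # no further alerts until the queue drains
        elif not armed and count < 0.8 * threshold:
            armed = True  # dropped below 80% of the threshold: re-arm
    return alerts


# Queue sizes sampled over time with QueueCountAlert = 10.
samples = [5, 10, 11, 12, 9, 11, 7, 10]
fired = alerts_for(samples, 10)
```

Under this reading, the samples above produce two alerts: one when the queue first reaches 10, and one when it reaches 10 again after having dropped to 7. The values 11 and 12 in between trigger nothing, and 9 does not re-arm the alert because it is not below 8.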

6
0 603
Question Alexey Maslov · Nov 28, 2017

In short, I wanted to react to CPUusage warnings and alerts with my own actions. It seemed that it was possible in my Caché version (2015.1):
http://docs.intersystems.com/cache201513/csp/docbook/DocBook.UI.Page.cl…

But all my attempts silently failed. Callback code was as simple as possible: 

Class %z.Monitor.Health Extends SYS.Monitor.Health.AbstractCallback

I've got my alerts written to alerts.log and cconsole.

6
0 499
Question David.Satorres6134 · Sep 12, 2019

Hi all,

I recently discovered the Monitoring Activity Volume feature in IRIS and I was amazed by it. So, I put it to work in one of our productions. It is nice how easy it is to set up, along with all the possibilities that come with it.

But there's something weird: the numbers. Actually, one of the BPs reports a processing time of more than 6 seconds:

But that is not really possible, as our production runs at a pace of about 40 msg/second and this BP is the first step. So my question is: how is this avg. duration calculated? What does this time include? Is it in seconds?

Thanks a lot,

6
1 461
Question Mack Altman · Jan 21, 2017

Can someone direct me to where in the documentation we can find how consumption may be calculated for global storage?

Caché Version 2010.1
Operating System HP OpenVMS 8.4

EDIT: After receiving some responses, it seems I was unclear in my initial inquiry. I am looking to determine our rate of consumption of storage; however, I am having some difficulty in doing that.

While utilizing ^%GSIZE, which is used by the %GlobalEdit class, the results appeared odd. I have provided my results below, which illustrate the global structure on the left and the usage indicated by ^%GSIZE on the right.

6
0 925
Question Mark OReilly · May 13, 2022

Hi:

Currently we are using an older Healthshare instance but I am not opposed to using IRIS as we will upgrade eventually. 

Currently, for monitoring productions we have a Monitor screen. We have both the Queues page and a DeepSee dashboard which shows the current status of our services. The issues with the DeepSee traffic-light method we currently have are: 1) the page is a bit slow to load the metrics; 2) for any new services from the team, a new widget needs to be created, and although this is easy enough to do, it is time-consuming.

5
0 532
Question Paster-Bachar Gadi · May 23, 2018

Hi All

I'm looking for a simple, quick, and easy solution to monitor a SQL table through Ensemble.

I have a process that updates a database: a scheduled task that runs every night (not Ensemble).

At the end it updates a table (replica_status) with a new record with two fields: Id, DateTime

I looked around the community but didn't find an answered case.

I'm thinking of a Task that will run a SQL outbound adapter BO that checks that table and sends an alert if no new record was created yesterday.

Is this the right approach, or is there a better solution?

Thanks Gadi
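The check described in the question above (alert if no new record was created yesterday) comes down to one date-range query. The Python sketch below uses an in-memory SQLite table as a stand-in for the real database; the table and column names come from the post, while the functions `has_record_for` and `needs_alert` and the assumption that DateTime is stored as `YYYY-MM-DD HH:MM:SS` text are hypothetical.

```python
import sqlite3
from datetime import date, datetime, timedelta


def has_record_for(conn, day: date) -> bool:
    """True if replica_status holds at least one row stamped on the given day."""
    start = datetime.combine(day, datetime.min.time())
    end = start + timedelta(days=1)
    row = conn.execute(
        'SELECT COUNT(*) FROM replica_status WHERE "DateTime" >= ? AND "DateTime" < ?',
        (start.isoformat(sep=" "), end.isoformat(sep=" ")),
    ).fetchone()
    return row[0] > 0


def needs_alert(conn, today: date) -> bool:
    """True when the nightly job left no record for yesterday."""
    return not has_record_for(conn, today - timedelta(days=1))


# Stand-in database with a single nightly record (assumed text timestamps).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE replica_status (Id INTEGER PRIMARY KEY, "DateTime" TEXT)')
conn.execute('INSERT INTO replica_status ("DateTime") VALUES (?)', ("2018-05-22 02:15:00",))
```

Using a half-open range (from midnight up to, but excluding, the next midnight) avoids missing records written late in the day.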

5
0 558
Question Martin Staudigel · Dec 2, 2021

Hello everybody,

After updating from 2018.2.1 to 2021.1, we observe a change in the behaviour of the Messagebank Enterprise Monitor.

In 2018.2.1, clicking on a specific line inside the configured systems opened the system dashboard, giving insights about queue counts and error conditions.

In 2021.1, when doing the same thing, the login screen of the designated server instance shows up but does not allow us to log in (any attempt, even with valid %All credentials, results in a reload of the page). I even created a new user, remembering the password hash issue mentioned in https://docs.intersystems.

5
0 408
Question Fahima Ansari · Apr 1, 2024

In the Business Process and the Business Operation, I am using the following code to get the values of TimeCreated and TimeProcessed:

BP:

%Ensemble("Process").%PrimaryRequestHeader.TimeCreated

%Ensemble("Process").%PrimaryRequestHeader.TimeProcessed

BO:

..%RequestHeader.TimeCreated

..%RequestHeader.TimeProcessed

But when I try to use ..%RequestHeader.TimeCreated in a Business Service, it does not hold any value.

How can I get the values of TimeCreated and TimeProcessed in a Business Service?

5
0 203
Question Stuart Byrne · Dec 17, 2019

Off the back of the Interface Monitoring post I had created a class that queries the Ens.AlertRequest global and returns the entries between 6pm the night before and 6am in the morning.   

I tested this build in our T&D environments and the build worked very well.

However, in our production environment the query is being truncated by what I believe to be a timeout, and I get only partial query output.

In the System>SQL pages my 12 hour query times out.

I compared the Global size by running a SELECT MAX(ID) query and got a return of 60,244,962 records.

5
0 312
Question Murray Oldfield · Mar 19, 2021

SAM - Hacks and Tips for set up and adding metrics from non-IRIS targets

SAM (System Alerting and Monitoring) comes as a 'batteries included' docker-compose container set that is ready to start monitoring IRIS instances with a default dashboard as soon as it starts up. The initial configuration is good for understanding SAM functionality and starting basic monitoring of your IRIS systems. However, out of the box, there are some settings that you will need to change when you start to monitor many systems and collect a lot of metric data.

5
0 828
Question Alexey Maslov · May 11, 2017

Since most of our customers moved to Caché 2015.1, some admins have been plagued by CPUPct warnings (sometimes alerts) in the console log without any other signs of insufficient CPU power.
Documentation states that:

          CPUPct               job_type              CPU usage (percent) by all processes of the listed job type in aggregate       

What does it really mean?
E.g., if total system CPU usage is 25%, and all running processes are of the same type (e.g., CSPSRV), would CPUPct be equal to 100%? If so, why should this case be a reason for an alert?

4
0 787
Question Han Ya · Sep 25, 2020

Whenever the Windows SNMP Service restarts, the snmpdbg log says the following. 

16:58:25 :Debug tracing enabled for SNMP agent
16:58:25 :SnmpExtensionInit called, pid=4432, tid=12276
16:58:25 :CreateEvent for CacheSNMPTrap suceeded
16:58:25 :register Cache OID 1.3.6.1.4.1.16563.1
16:58:25 :Get all Cache configs ...
16:58:25 :found 1 configs
16:58:25 :Add ENSEMBLE config to list ... 
16:58:25 :RegOpenKey for SOFTWARE\InterSystems\Cache\Configurations\ENSEMBLE\Properties

4
0 638
Question Alfredo Neto · Oct 13, 2022

Hello,

I am currently trying to enable Prometheus for an IRIS database.

The environment in question uses IKO as a base.

I need to add 3 annotations to the IRIS service.

They are:

annotations:
   prometheus.io/path: "/monitor/metrics"
   prometheus.io/port: "52772"
   prometheus.io/scrape: "true"

I'm not finding this possibility in the IKO documentation.

Has anyone had this experience and can help us with this challenge?

Below is the current configuration we made; however, it did not create the annotations we need:

apiVersion: intersystems.com/v1alpha1
kind: IrisCluster
metadata:
  name: iris-db-teste
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/api/monitor/metrics"
    prometheus.io/port: "52773"
spec:
  licenseKeySecret:
    name: licenca-iris
  configSource:
    name: iris-cpf
  topology:
    data:
      shards: 2
      mirrored: true
      image: CONTAINER_IMAGE
      podTemplate:
        spec:
          args:
            - --check-caps
            - "false"
      storageDB:
        resources:
          requests:
            storage: 10Gi
        storageClassName: iris-ssd-storageclass
  serviceTemplate:
    spec:
      type: ClusterIP
4
0 309
Question Guilherme Silva · Jan 3, 2018

I want to understand how this message is built:

[SYSTEM MONITOR] CPUusage Alert: CPUusage = 99, 99, 99 (Max value is 85).

Does Caché keep a log of CPU usage (99, 99, 99), and how frequently is it checked?

How can I change the max value? Is that possible?

Best,

4
0 1035
Question Piotr Stefańczyk · Oct 26, 2022

Hello

I have a problem enabling SNMP monitoring on Cache.

I installed the NET SNMP 5.7.2 package from HP Software Center on HP UX and enabled the agentX protocol in snmpd.cfg.

When I enabled full debugging on Cache and NET SNMP, I discovered that the packets sent and received on the two sides are not the same. Some bytes are different. I think the problem is the default charset for TCP/IP connections, which on our system is set to CP1250 instead of the default RAW. The result is that Cache notifications are not visible from snmpwalk etc.

Is there a solution for this issue? 

Peter

4
0 299
New
Question Luis Gallardo · May 6

Hi! 
We are working on containerizing our IRIS product. We want to extract the message log that is shown in the terminal, but if possible, we want to format the output as JSON and include some extra fields from the instance to enhance our monitoring. Is this possible?
Any guide or example about it?
Thanks!

4
0 45
Question Laura Blázquez García · Feb 8, 2017

Hi,

We want to monitor an Ensemble Production and send custom email alerts based on some Rules. For example, if we normally receive 1 message per second and suddenly receive 5 or more messages per second, we want to send an email alert. And if tomorrow we don't want to check this any more, we want to disable it through Ensemble Business Rules.
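The rate rule in the example above (alert at 5 or more messages per second, with a switch to turn the check off) can be sketched with a sliding window over message timestamps. This Python sketch only illustrates the logic and is not an Ensemble Business Rule; the class `RateMonitor` and its attributes are hypothetical names.

```python
from collections import deque


class RateMonitor:
    """Counts messages inside a sliding time window and flags a high rate."""

    def __init__(self, max_per_second: float, window_seconds: float = 1.0):
        self.max_per_second = max_per_second
        self.window_seconds = window_seconds
        self.enabled = True  # set to False to switch the check off
        self._timestamps = deque()

    def record(self, timestamp: float) -> bool:
        """Record one message; return True when the rate reaches the limit."""
        self._timestamps.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while timestamp - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        rate = len(self._timestamps) / self.window_seconds
        return self.enabled and rate >= self.max_per_second


monitor = RateMonitor(max_per_second=5)
results = [monitor.record(t) for t in (0.0, 0.1, 0.2, 0.3, 0.4)]
```

Here the fifth message arriving within one second is the first to return True. Setting `monitor.enabled = False` suppresses the alert without discarding the counting logic, which mirrors the wish to disable the check later.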

4
0 1215