Monitoring

There have been some very helpful articles in the community that show how to use Grafana with IRIS (or Cache/Ensemble) by using an intermediate database.

But I wanted to get at IRIS structures directly. In particular, I wanted to access the Caché History Monitor data that is accessible by SQL, as described here

https://community.intersystems.com/post/apm-using-cach%C3%A9-history-mon...

and didn't want anything between me and the data.

I already had class queries that returned the data I wanted, so I just needed to embed them in a REST class that returned JSON. I haven't included my class Grafana.MonitorData because it could be anything, but I can if people want it.
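As a rough illustration of the JSON such a REST endpoint might return, here is a minimal Python sketch. It assumes the response shape used by Grafana's SimpleJson-style data sources (a list of objects with `target` and `datapoints`); the post does not say which Grafana data source was used, and the metric name `GloRefs` and the sample rows are hypothetical stand-ins for class query output.

```python
import json
from datetime import datetime, timezone


def to_grafana_series(target, rows):
    """Convert (timestamp, value) rows into one Grafana time series.

    Assumed shape: {"target": name, "datapoints": [[value, epoch_ms], ...]}.
    Timestamps are naive datetimes that are treated as UTC.
    """
    datapoints = []
    for ts, value in rows:
        epoch_ms = int(ts.replace(tzinfo=timezone.utc).timestamp() * 1000)
        datapoints.append([value, epoch_ms])
    return {"target": target, "datapoints": datapoints}


# Hypothetical rows, standing in for the result set of a class query.
rows = [
    (datetime(2019, 2, 1, 12, 0, 0), 1500),
    (datetime(2019, 2, 1, 12, 5, 0), 1725),
]
payload = json.dumps([to_grafana_series("GloRefs", rows)])
```

The same reshaping would have to happen inside the REST class itself; the sketch only shows the target structure, not the ObjectScript side.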

Last comment 26 February 2019

Hey Developer Community,

I wanted to reach out and ask the group who is still using the Polymetric Dashboard? You can read about it in this older Community article.

It was created as a sample designed to be customized by you, so it is a community-supported tool rather than an official InterSystems-supported tool. I've noticed that it hasn't been updated in a while and that some of its dependencies have potential security vulnerabilities. If you are using the tool internally, that isn't an issue, but I was wondering whether anybody who is using it has already updated their code with newer dependencies.

So, is there interest from the community in updating it?  Or testing it with IRIS?

- Doug

Can someone tell me if /deepsee-sysmon-dashboards is developed for a specific version of Ensemble? Looks like it could be useful to my group but we aren't upgrading till later this year and we are on 2015.2.2.

Thanks

Scott

Last answer 7 January 2019

In looking at the Production monitor within Ensemble, I was wondering if there is a way we could customize it for our use. I notice it is basically a dashboard.

For example, I would like to display only those Services, Processes, and Operations that are truly in dire need of attention. The Monitor out of the box just seems too busy, and I would like to simplify it.

I was trying to find a sample of how a Monitor Dashboard would be set up, but I am not seeing anything in ENSDEMO or SAMPLES. Has anyone created a Custom Dashboard/Monitor for their purposes? Would you be willing to share an example with me?

Thanks

Scott Roth

The Ohio State University Wexner Medical Center

Last comment 3 December 2018

Hi

Totally new to IRIS and Cache.

We are trying to evaluate it and work out how we could use it as a standard application database. Object or relational does not matter.

The issue is ObjectScript.

So:

1) Can we develop, maintain and use an IRIS database and never use ObjectScript i.e. use only Java, Python, C++ interfaces etc. (exactly which one does not matter)? Would that make designing and using the IRIS database more prone to inefficiency and error?

2) Can we import an existing Cache database into IRIS and convert its ObjectScript code into Java, Python whatever? Is that a big, difficult, error-prone job?

If the answers are no, that may not be a showstopper, but we would like to know now.

I know a lot of training will be involved in any case, and Oracle has PL/SQL, but ObjectScript developers are rare.

Apologies if the answers are in the docs. I have read some of them but urgently need some indication about the above.

Last answer 14 November 2018 Last comment 14 November 2018

Hi,

We want to monitor an Ensemble Production and send custom email alerts based on some rules. For example, if we normally receive 1 message per second and suddenly receive 5 or more messages per second, we want to send an email alert. And if tomorrow we no longer want this check, we want to be able to disable it through Ensemble Business Rules.
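The check described above can be sketched in a few lines of Python. This only illustrates the logic (a rate threshold plus an on/off switch); `RateRule` and `should_alert` are hypothetical names, not part of Ensemble, and in a real production the rule would live in an Ensemble Business Rule.

```python
from dataclasses import dataclass


@dataclass
class RateRule:
    """Hypothetical rule: alert once the rate reaches `threshold` messages per second."""
    threshold: float
    enabled: bool = True


def should_alert(message_count, window_seconds, rule):
    """Return True when the observed rate in the window meets the rule's threshold."""
    if not rule.enabled or window_seconds <= 0:
        return False
    rate = message_count / window_seconds
    return rate >= rule.threshold
```

For example, 300 messages in a 60-second window is 5 messages per second, so a rule with `threshold=5.0` fires; setting `enabled=False` suppresses it without deleting the rule.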

Last answer 8 February 2017 Last comment 30 May 2018

Hi All

I'm looking for a simple, quick and easy way to monitor a SQL table through Ensemble.

I have a process, a scheduled task that runs every night (not in Ensemble), that updates a database.

At the end it adds a new record to a table (replica_status) with two fields: Id, DateTime

I looked around the community but didn't find an answered case.

I'm thinking of a Task that runs a SQL outbound adapter BO that checks that table and sends an alert if no new record was created yesterday.

Is this the right approach, or is there a better solution?

Thanks Gadi
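The check itself is a single range query. Here is a minimal Python sketch of it, using an in-memory SQLite table as a stand-in for the real database; the table and column names come from the post, while the text storage format for DateTime and the function name `has_record_for` are assumptions.

```python
import sqlite3
from datetime import date, datetime, timedelta


def has_record_for(conn, day):
    """Return True if replica_status has a row whose DateTime falls on `day`."""
    start = datetime.combine(day, datetime.min.time())
    end = start + timedelta(days=1)
    row = conn.execute(
        "SELECT COUNT(*) FROM replica_status WHERE DateTime >= ? AND DateTime < ?",
        (start.isoformat(sep=" "), end.isoformat(sep=" ")),
    ).fetchone()
    return row[0] > 0


# Stand-in database: DateTime is assumed to be stored as 'YYYY-MM-DD HH:MM:SS' text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE replica_status (Id INTEGER PRIMARY KEY, DateTime TEXT)")
conn.execute("INSERT INTO replica_status (DateTime) VALUES ('2018-05-23 02:15:00')")

# A nightly task would pass yesterday's date and raise an alert on False.
yesterday = date.today() - timedelta(days=1)
missing = not has_record_for(conn, yesterday)
```

Using a half-open range (`>= start` and `< end`) avoids missing records written just before midnight.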

Last answer 24 May 2018 Last comment 29 May 2018

Hi community,

I need to monitor InterSystems Caché with some custom indicators.

I started customizing the SNMP MIB. But I attended a Zabbix event where all the speakers used ODBC to monitor their databases: Oracle, MySQL, PostgreSQL ...

What is the best way? Use ODBC or a custom SNMP MIB?
What are you guys using?

Last answer 28 March 2018 Last comment 26 February 2018

This post is dedicated to the task of monitoring a Caché instance using SNMP. Some users of Caché are probably doing it already in some way or another. Monitoring via SNMP has been supported by the standard Caché package for a long time now, but not all the necessary parameters are available “out of the box”. For example, it would be nice to monitor the number of CSP sessions, get detailed information about the use of the license, particular KPI’s of the system being used and such. After reading this article, you will know how to add your parameters to Caché monitoring using SNMP.

What we already have

Last comment 22 March 2018

I wanted to see some of the alerts that occur in my Productions on a mobile device. I recently came across Pushover.net: although it has an upfront cost of $5, you can send as many messages as you like after that, and there is a 7-day free trial to check it out.

To integrate this with a production I did the following.

Create an account and set up a device on https://pushover.net/

Record the following API keys from the web site. On the main page you will see

Your User Key   example: ueh3t7478foi3ruf2ogb3syu4fs34s

You then need to create an application. I set

Name : Cache

Type : Application

And left the other settings blank.

This will give you an

API Token/Key  example: auh000es1aaaa7ddeb3i4jfkgwswero

I then created this class to add an operation to the production, replace your Key, token and device name int
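For readers who want to test their keys before wiring up the production, here is a minimal Python sketch of the HTTP call involved. It is not the author's class: it only assumes the public Pushover message endpoint and its `token`, `user`, `message` and optional `device` form fields, and the function names are hypothetical.

```python
import urllib.parse
import urllib.request

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def build_pushover_request(token, user, message, device=None):
    """Build the form-encoded POST request without sending it."""
    fields = {"token": token, "user": user, "message": message}
    if device:
        fields["device"] = device  # optional: target a single device by name
    data = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(PUSHOVER_URL, data=data, method="POST")


def send_alert(token, user, message, device=None):
    """Send the message and return the HTTP status code."""
    req = build_pushover_request(token, user, message, device)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

Calling `send_alert` with the API Token/Key and User Key recorded above should deliver a notification to the registered device; the request is built separately so it can be inspected without network access.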

Last comment 16 March 2018

APM normally focuses on the activity of the application but gathering information about system usage gives you important background information that helps understand and manage the performance of your application so I am including the Caché History Monitor in this series.

In this article I will briefly describe how you start the Caché History Monitor to build a record of the system level activity to go with the application activity and performance information you gather. I will also give examples of SQL to access the information.

What is the Caché History Monitor?

The Caché History Monitor is an extension of the Caché System Monitor. It keeps a persistent record of the metrics relating to the database activity (e.g. global reads and updates) and system usage (e.g. CPU usage)

Last comment 18 January 2018

A practical guide to using the tools PERFMON and MONLBL.

Introduction

When investigating performance problems, I often use the utilities ^PERFMON and ^%SYS.MONLBL to identify exactly where in the application pieces of code are taking a long time to execute. In this short paper I will describe an approach that first uses ^PERFMON to identify the busiest routines and then uses ^%SYS.MONLBL to analyze those routines in detail to show which lines are the most expensive.

^PERFMON and ^%SYS.MONLBL are each described in detail in the Caché documentation, but here I summarize when, why and how you can use them together to find the cause of a performance problem in an application.

Last comment 29 December 2017

In short, I wanted to react to CPUusage warnings and alerts with my own actions. It seemed that this was possible in my Caché version (2015.1):
http://docs.intersystems.com/cache201513/csp/docbook/DocBook.UI.Page.cls...

But all my attempts silently failed. Callback code was as simple as possible:

Last answer 29 November 2017 Last comment 30 November 2017

Application Performance Monitoring

Tools in InterSystems technology

Back in August in preparation for Global Summit I published a brief explanation of Application Performance Management (APM). To follow up on that I have written and will be publishing over the coming weeks a series of articles on APM.

One major element of APM is the construction of a historic record of application activity, performance and resource usage. Crucially for APM, the measurement starts with the application and what users are doing with it. By relating everything to business activity, you can focus on improving the level of service provided to users and the value delivered to the line of business that is ultimately paying for the application.

Last comment 21 November 2017

Using the CSP Page Statistics

Application Performance Management

Introduction

A key part of Application Performance Management (APM) is recording user activity and its performance. For many web applications the closest you can get to this is to record the CSP pages or CSP-based services being dispatched.

If the pages or service names are meaningful and they indicate the business activity being performed the CSP page statistics can be very useful in building up a historical record of activity, performance and resource usage. This allows you to recognize trends in activity for planning purposes. It allows you to see gradually degrading performance or increasing resource usage so you can take action before you have a crisis. Also, if you do have a performance crisis you can look back to see what has changed since the performance was fine

Last comment 11 November 2017

I have Ensemble/Healthshare running in a production environment which is setup with a mirror failover and an arbiter sitting between them.

In the event of a failover we have a number of connections that need stopping/monitoring and starting in a certain order.

Is there a programmatic way to detect the failover, stop certain services and operations immediately, and then start them up again in the required order, checking each connection's state before starting the next?

I am thinking Ens.Director is probably what I need; however, I need some guidance on how to implement a solution.

Last comment 10 November 2017

Hello! This article continues the article "Making Prometheus Monitoring for InterSystems Caché". We will take a look at one way of visualizing the results of the work of the ^mgstat tool. This tool provides the statistics of Caché performance, and specifically the number of calls for globals and routines (local and over ECP), the length of the write daemon’s queue, the number of blocks saved to the disk and read from it, amount of ECP traffic and more.

Last comment 3 November 2017

Hello, 

I would like to implement the Activity Monitor in a Sharepoint page. 
How is it possible to integrate only the Zen element?

Is it necessary to develop a CSP application in which this element exists?

Has anyone done this before and can I get a tip?

With kind regards
Armin

Last answer 27 September 2017 Last comment 30 October 2017

Caché Version String: Cache for UNIX (Red Hat Enterprise Linux for x86-64) 2016.2.1

We have a mirrored Ensemble system (110 as backup and 210 as primary). At one point (14:00) there was a disruption in the production: messages were not being processed.

Looking at the pButtons output (sampled every 10 seconds), I see the following abnormality in the WDphase

Last answer 8 September 2017 Last comment 29 August 2017

Prometheus is one of the monitoring systems adapted for collecting time series data.

Its installation and initial configuration are relatively easy. The system has a built-in graphic subsystem called PromDash for visualizing data, but developers recommend using a free third-party product called Grafana. Prometheus can monitor a lot of things (hardware, containers, various DBMS's), but in this article, I would like to take a look at the monitoring of a Caché instance (to be exact, it will be an Ensemble instance, but the metrics will be from Caché). If you are interested – read along

Last comment 31 August 2017

Is it possible to dynamically adjust the RetryInterval and FailureTimeout settings in a BPL?

I've got a business process that calls a web service operation to get a session ID from an external system. A string property returned in the body of the response indicates that an exception occurred in the external system. I have code in the BPL that examines the property and sets the status property to an error status when that occurs.

Depending on what the value is, I want to adjust the RetryInterval and FailureTimeout values used by the system when ReplyCodeActions is set to E=RD.

Example: If the string is "No sessions currently available" I want to retry every 15 seconds for 10 minutes before disabling the process and raising an alert.

If instead the string is "System is down for nightly maintenance"  I want to retry every 5 minutes for 4 hours before disabling the process and raising an alert.

Suggestions?
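The two examples above amount to a lookup from error text to an (interval, timeout) pair. A minimal Python sketch of that mapping follows; it does not show how Ensemble would apply the values, and `RETRY_POLICIES`, `DEFAULT_POLICY` and both function names are hypothetical.

```python
# Error text -> (retry interval in seconds, failure timeout in seconds).
RETRY_POLICIES = {
    "No sessions currently available": (15, 10 * 60),
    "System is down for nightly maintenance": (5 * 60, 4 * 60 * 60),
}

# Hypothetical fallback for any other error text.
DEFAULT_POLICY = (60, 15 * 60)


def retry_policy(error_text):
    """Return the (interval, timeout) pair for a given error string."""
    return RETRY_POLICIES.get(error_text, DEFAULT_POLICY)


def max_attempts(policy):
    """Number of whole retry intervals that fit in the failure timeout."""
    interval, timeout = policy
    return timeout // interval
```

With these values, the first case allows 40 retries and the second 48 before the process would be disabled and an alert raised.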

Last answer 30 August 2017 Last comment 31 August 2017

Please excuse my ignorance. I am trying to identify what areas would be best to review in the System Dashboard (for Cache 2010.2) for performance issues with the database. It seems to be running slower than usual, but I am trying to find out the best way to go about identifying what the issue is.

The following are captures from the System Dashboard.

As always, thanks a lot for your help.

System Dashboard

Global and Routine Statistics

ECP Statistics

Last comment 15 August 2017

Hi
I would like to track any change made in the production, i.e.
date, time, user, production, service/process/operation name, item changed, value before change, value after change.
Is there a way I can get this data and trap the act of change in order to log it?
Thanks Simcha

Last answer 20 July 2017 Last comment 17 July 2017

Hi, Community!

Please find the Developer Community video of the week:  

Monitoring : Don't Turn a Drama into a Crisis

Since most of our customers moved to Caché 2015.1, some admins have been flooded with CPUPct warnings (sometimes alerts) in the console log without any other signs of insufficient CPU power.
Documentation states that:

          CPUPct              job_type             CPU usage (percent) by all processes of the listed job type in aggregate       

What does it really mean?
E.g., if total system CPU usage is 25%, and all running processes are of the same type (e.g., CSPSRV), would CPUPct be equal to 100%? If so, why should this case be a reason for an alert?

Checking the latest docs showed that the description of CPUPct has been removed from them, while the object still exists (at least in 2017.1.0).

Last answer 23 May 2017 Last comment 24 May 2017