Written by

Gozynta

Question Ethan Calloway · 11 hr ago

Best way to monitor InterSystems IRIS performance in production

#InterSystems IRIS #Monitoring #Performance

I’m looking for practical recommendations to monitor InterSystems IRIS performance in a production environment. What are the most useful built-in tools, metrics, or best practices to track system health, database performance, and possible bottlenecks?

I would also like to know if there are recommended dashboards, logs, or monitoring integrations commonly used by the community.

Comments

DC AI Bot · 11 hr ago Bot

A practical production monitoring approach is to combine built-in portal monitoring, REST-exposed metrics, and production-specific monitoring if you use interoperability. [1][2]

For built-in tools, the Management Portal provides system usage and performance monitoring, including the System Usage page under System Operation > System Usage. The documented statistics there include:

- Global references (all) and Global update references, to track overall database activity
- Routine calls
- Routine buffer loads and saves; in a well-tuned environment this count should increase slowly, because most routine loads are satisfied from routine cache memory
- Logical block requests
- Block reads and Block writes, for physical database I/O
- WIJ writes
- Journal entries and Journal block writes [1]

Those metrics are useful for spotting bottlenecks in database access, cache effectiveness, disk I/O pressure, and journaling activity. [1]
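Single readings of these counters say little on their own; the change between two readings is what shows pressure. The sketch below is only an illustration: it assumes the counters are cumulative, that you have copied two readings by hand or by some export, and that the ratio of block reads to logical block requests is a useful rough indicator of cache effectiveness. `UsageSample`, `rates`, and all numbers are hypothetical, not part of any InterSystems API.

```python
from dataclasses import dataclass


@dataclass
class UsageSample:
    """One reading of cumulative System Usage counters (hypothetical container)."""
    seconds: float  # time of the reading, in seconds
    logical_block_requests: int
    block_reads: int
    block_writes: int
    journal_entries: int


def rates(prev: UsageSample, curr: UsageSample) -> dict:
    """Convert two cumulative readings into per-second rates."""
    elapsed = curr.seconds - prev.seconds
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    logical = curr.logical_block_requests - prev.logical_block_requests
    reads = curr.block_reads - prev.block_reads
    return {
        "logical_requests_per_sec": logical / elapsed,
        "block_reads_per_sec": reads / elapsed,
        "block_writes_per_sec": (curr.block_writes - prev.block_writes) / elapsed,
        "journal_entries_per_sec": (curr.journal_entries - prev.journal_entries) / elapsed,
        # Share of logical requests that needed a physical read (illustrative indicator).
        "physical_read_ratio": reads / logical if logical else 0.0,
    }


# Made-up readings taken 60 seconds apart.
before = UsageSample(0.0, 1_000_000, 20_000, 5_000, 40_000)
after = UsageSample(60.0, 1_600_000, 26_000, 6_200, 46_000)
result = rates(before, after)
print(result)
```

A physical read ratio that climbs while logical requests stay flat would suggest the database cache is no longer absorbing the workload, which is the kind of bottleneck these statistics are meant to expose.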

The portal documentation also identifies three categories of performance information to monitor:

- System Usage Table
- Shared Memory Heap Usage
- Interoperability Usage [1]

If you want dashboard-style monitoring and integration with external monitoring systems, InterSystems IRIS exposes metrics through a REST /metrics endpoint. The returned metrics include, for example:

- Write daemon timing, such as iris_wd_write_time
- Write image journal timing, such as iris_wij_write_time
- Work Queue Manager metrics, such as:
  - iris_wqm_active_worker_jobs
  - iris_wqm_commands_per_sec
  - iris_wqm_globals_per_sec
  - iris_wqm_max_active_worker_jobs
  - iris_wqm_max_work_queue_depth
  - iris_wqm_waiting_worker_jobs [2]

These REST metrics are appropriate when you want dashboards and monitoring integrations rather than relying only on the Management Portal. [2]
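As a rough illustration of consuming that endpoint from a script, here is a minimal Python sketch. It assumes the endpoint returns Prometheus-style text with one `name value` pair per line and no trailing timestamp, and that it is reachable at a URL like `http://localhost:52773/api/monitor/metrics`; the host, port, label values, and sample numbers are placeholders, so check them against your own instance.

```python
import urllib.request


def fetch_metrics_text(url: str, timeout: float = 5.0) -> str:
    """Download the raw metrics text. The URL is supplied by the caller."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metrics(text: str) -> dict:
    """Parse Prometheus-style text into {metric name (with labels): float}."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        # Split on the last space so label values containing spaces survive.
        name, _, value = line.rpartition(" ")
        if not name:
            continue
        try:
            metrics[name] = float(value)
        except ValueError:
            continue  # skip lines that do not end in a number
    return metrics


# Made-up sample in the assumed format; the values are not real measurements.
SAMPLE = """\
iris_wd_write_time 12
iris_wij_write_time 8
iris_wqm_active_worker_jobs{id="SYS"} 2
iris_wqm_max_work_queue_depth{id="SYS"} 0
"""

parsed = parse_metrics(SAMPLE)
print(parsed["iris_wd_write_time"])
```

In practice most sites point a Prometheus-compatible scraper at the endpoint instead of parsing it by hand; a script like this is mainly handy for a quick check that the endpoint responds and returns the metrics you expect, for example `parse_metrics(fetch_metrics_text(url))`.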

If you run interoperability productions, there is a dedicated monitoring path. The Production Monitor shows whether productions are running and whether any of them are in a bad state. [3]

For interoperability metrics through REST, metric collection for productions is disabled by default. To enable monitoring for a production namespace, the documented steps include switching to the namespace and running:

set $namespace = "[interopNS]"
do ##class(Ens.Util.Statistics).EnableSAMForNamespace()

[2]

To gather more detailed production metrics, you can also:

- enable activity monitoring with Ens.Util.Statistics.EnableStatsForProduction
- add Ens.Activity.Operation.Local to the production
- enable Provide Metrics for SAM for HTTP or SOAP outbound operations if you want HTTP transmission metrics [2]

Note:

The interoperability production data collected by the /metrics endpoint described in this section is very granular, providing detailed information about number of messages processed, average number of characters processed, etc. Users who wish to do a broader check-in about the statuses of their production interfaces can take advantage of REST APIs for Interoperability Usage Metrics. [2]

Note:

If the recording of metrics is enabled for a namespace but the corresponding production is not active, the /metrics endpoint does not return any metrics. [2]
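Because an inactive production simply yields no metrics, a missing metric can be mistaken for a quiet system. One way to catch that is to check the scraped text explicitly for production metrics. The sketch below assumes those metric names share the prefix `iris_interop_`; treat the prefix, the sample metric name, and its label as assumptions to verify against your own /metrics output.

```python
def interop_metric_names(metrics_text: str, prefix: str = "iris_interop_") -> list:
    """Return the sorted, distinct metric names that start with the given prefix."""
    names = set()
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        # The metric name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(prefix):
            names.add(name)
    return sorted(names)


# Made-up samples: one scrape with production metrics, one without.
WITH_PRODUCTION = 'iris_wd_write_time 12\niris_interop_messages{id="hypothetical"} 42\n'
WITHOUT_PRODUCTION = "iris_wd_write_time 12\n"

print(interop_metric_names(WITH_PRODUCTION))
print(interop_metric_names(WITHOUT_PRODUCTION))
```

An empty result while the production is supposed to be running is worth alerting on, since it may mean the production is stopped or that metric recording was never enabled for the namespace.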

Note:

Currently, HTTP transmission metrics are only collected for business operations which invoke actors using the Queue style (not inProc). For more information on the difference between these invocation styles, see Defining a Business Operation Class. [2]

For community-recommended practice, one cited recommendation is to set up an automated 24-hour system performance report using ^SystemPerformance, and then generate performance charts using YASPE for insight into how the system is handling load. Another community recommendation in the same discussion is to enable the History Monitor, which was described there as providing almost everything ^SystemPerformance gives, with less effort. That same discussion also mentions using INFORMATION_SCHEMA.STATEMENT_DAILY_STATS for SQL performance statistics and checking interoperability activity and volume statistics for productions. [4]

So, a practical production recommendation is:

- Use the Management Portal regularly for core system usage, shared memory, and interoperability usage views. [1]
- Track key low-level indicators such as block reads/writes, logical block requests, WIJ writes, journal activity, and routine buffer loads/saves to identify I/O, cache, or journaling pressure. [1]
- Use the REST /metrics endpoint for external dashboards and monitoring integrations. [2]
- If you use interoperability, enable production-specific metrics and use the Production Monitor for status and bad-state monitoring. [3][2]
- For longer-term trend analysis, consider ^SystemPerformance with YASPE, and the History Monitor, as recommended in the community discussion. [4]