Article · Aug 8 · 11m read

Tracing InterSystems IRIS Applications Using Jaeger

This article outlines how to trace InterSystems IRIS applications using the well-known Jaeger solution. Jaeger is an open-source distributed tracing platform for monitoring and troubleshooting issues, especially in distributed and microservices environments. This tracing backend emerged at Uber in 2015, inspired by Google's Dapper and Twitter's OpenZipkin. It joined the Cloud Native Computing Foundation (CNCF) as an incubating project in 2017 and achieved graduated status in 2019. This guide will demonstrate how to operate the containerized Jaeger solution integrated with IRIS.

 

Jaeger Features

  1. Monitor transaction flows executed by one or more applications and components (services) in conventional or distributed environments.
  2. Pinpoint performance bottlenecks in business flows, including distributed ones.
  3. Analyze and optimize dependencies between services, components, classes, and methods.
  4. Identify performance metrics to discover opportunities for improvement.

Jaeger Components



From Jaeger documentation: https://www.jaegertracing.io/docs/1.23/architecture/

  1. Client: Solutions, applications, or technologies (such as IRIS) that send monitoring data to Jaeger.
  2. Agent: A network daemon that listens for spans sent over UDP, batches them, and forwards them to the collector. It is designed for deployment on all hosts as an infrastructure component, abstracting the routing and discovery of collectors from the client.
  3. Collector: Receives traces from Jaeger agents and runs them through a processing pipeline, which currently validates traces, indexes them, performs any necessary transformations, and finally stores them.
  4. Ingester (optional): If a message queue like Apache Kafka buffers data between the collector and storage, the Jaeger ingester reads from Kafka and writes to the storage.
  5. Query: A service that retrieves traces from storage and hosts a UI to display them.
  6. UI: The Jaeger Web Interface for analyzing and tracing transaction flows.

Overview of OpenTelemetry (OTel)

Since OpenTelemetry is the technology IRIS uses to send tracing data to Jaeger, it is important to understand how it works.
OpenTelemetry (aka OTel) is a vendor-neutral, open-source observability framework for instrumenting, generating, collecting, and exporting telemetry data such as traces, metrics, and logs. For this article, we will focus on the traces feature.
The fundamental unit of data in OpenTelemetry is the "signal." The purpose of OpenTelemetry is to collect, process, and export these signals, which are system outputs describing the underlying activity of the operating system and applications running on a platform. A signal can be something you want to measure at a specific point in time (e.g., temperature, memory usage), or an event that traverses components of your distributed system that you wish to trace. You can group different signals to observe the internal workings of the same piece of technology from various angles (source: https://opentelemetry.io/docs/concepts/signals/). This article will demonstrate how to emit signals associated with traces (the path of a request through your application) from IRIS to OTel collectors.
OpenTelemetry is an excellent choice for monitoring and tracing your IRIS environment and source code because it is supported by more than 40 observability vendors. It is also integrated into many libraries, services, and applications, and adopted by numerous end users (source: https://opentelemetry.io/docs/).

  • Microservices developed in Java, .NET, Python, NodeJS, InterSystems ObjectScript, and dozens of other languages can send telemetry data to an OTel collector via remote endpoints and ports (in our example, we will use an HTTP port).
  • Infrastructure components can also send data, particularly performance data, resource usage (processor, memory, etc.), and other information relevant to monitoring these components. For InterSystems IRIS, data is collected by Prometheus from the /monitor endpoint and can be transferred to an OTel Collector.
  • APIs and database tools can also send telemetry data. Some database products are capable of doing it automatically (automatic instrumentation).
  • The OTel Collector receives the OTel data and stores it in a compatible database and/or forwards it to monitoring tools (e.g., Jaeger).


How InterSystems IRIS Sends Monitoring Data to Jaeger

 

Starting with version 2025, InterSystems introduced support for OpenTelemetry (OTel) in its monitoring API. This new functionality includes the emission of telemetry data for tracing, logging, and environment metrics. This article discusses sending tracing data to Jaeger via OTel. You can find more details at https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=AOTEL&ADJUST=1.
While some programming languages and technologies support automatic OTel data transmission (automatic instrumentation), for IRIS you need to write specific code instructions to enable this functionality. Download the source code of the sample application from https://openexchange.intersystems.com/package/iris-telemetry-sample, open your IDE, and follow the steps below:

1. Navigate to the dc.Sample.REST.TelemetryUtil class. The SetTracerProvider method initializes a tracer provider, allowing you to set the name and version of the monitored service:

/// Set tracer provider
ClassMethod SetTracerProvider(ServiceName As %String, Version As %String) As %Status
{
    Set sc = $$$OK
    set attributes("service.name") = ServiceName
    set attributes("service.version") = Version
    Set tracerProv = ##class(%Trace.TracerProvider).%New(.attributes)
    Set sc = ##class(%Trace.Provider).SetTracerProvider(tracerProv)
    Quit sc
}

2. The next step is to retrieve the created tracer provider instance:
 

/// Get tracer provider
ClassMethod GetTracerProvider() As %Trace.Provider
{
    Return ##class(%Trace.Provider).GetTracerProvider()
}

3. This sample will monitor the PersonREST API on the /persons/all endpoint. Go to the dc.Sample.PersonREST class (GetAllPersons class method):

/// Retrieve all the records of dc.Sample.Person
ClassMethod GetAllPersons() As %Status
{


    #dim tSC As %Status = $$$OK
    do ##class(dc.Sample.REST.TelemetryUtil).SetTracerProvider("Get.All.Persons", "1.0")
    set tracerProv = ##class(dc.Sample.REST.TelemetryUtil).GetTracerProvider()
    set tracer = tracerProv.GetTracer("Get.All.Persons", "1.0")

4. The tracer provider has just created a monitoring service called Get.All.Persons (with version 1.0), and obtained the tracer instance with GetTracer.
5. The sample should create a root Span as follows:
 

    set rootAttr("GetAllPersons") = 1
    set rootSpan = tracer.StartSpan("GetAllPersons", , "Server", .rootAttr)
    set rootScope = tracer.SetActiveSpan(rootSpan)

6. Spans are the building blocks of the tracing flow. Each span must be mapped to a piece of source code you want to analyze.
7. Calling SetActiveSpan is mandatory to mark the span that is currently being monitored.
8. Now, the sample creates some child spans mapped to important pieces of the flow:
 

    set childSpan1 = tracer.StartSpan("Query.All.Persons")
    set child1Scope = tracer.SetActiveSpan(childSpan1)
    Try {
        Set rset = ##class(dc.Sample.Person).ExtentFunc()
        do childSpan1.SetStatus("Ok")
    } Catch Ex {
        do childSpan1.SetStatus("Error")
    }
    do childSpan1.End()
    kill childSpan1

9. This first child span monitors the query for all persons in the database. To create a child span, the sample uses StartSpan with a descriptive name (Query.All.Persons). The active span must then be set to the current child span, which the sample achieves by calling SetActiveSpan with the childSpan1 reference.
10. The sample executes the business source code (Set rset = ##class(dc.Sample.Person).ExtentFunc()). If the operation is successful, it sets the status to "Ok"; otherwise, it sets it to "Error".
11. The sample ends the monitoring of this piece of code by calling the End method and killing the childSpan1 reference.
12. You can repeat this procedure for all other code segments you wish to scrutinize:

To monitor retrieving a person by ID (get person details):

set childSpan2 = tracer.StartSpan("Get.PersonByID")
set child2Scope = tracer.SetActiveSpan(childSpan2)
Set person = ##class(dc.Sample.Person).%OpenId(rset.ID)

To observe the Age calculation (class dc.Sample.Person, method CalculateAge):
 

set tracerProv = ##class(dc.Sample.REST.TelemetryUtil).GetTracerProvider()
set tracer = tracerProv.GetTracer("Get.All.Persons", "1.0")
set childSpan1 = tracer.StartSpan("CalculateAge")
set child1Scope = tracer.SetActiveSpan(childSpan1)

To observe the Zodiac Sign calculation (class dc.Sample.Person, method CalculateZodiacSign):
 

set tracerProv = ##class(dc.Sample.REST.TelemetryUtil).GetTracerProvider()
set tracer = tracerProv.GetTracer("Get.All.Persons", "1.0")
set childSpan1 = tracer.StartSpan("GetZodiacSign")
set child1Scope = tracer.SetActiveSpan(childSpan1)
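
The excerpts above show only where each child span starts; as with Query.All.Persons, every span must eventually be closed so that its duration is recorded. A minimal sketch of the full child-span pattern (My.Operation is a hypothetical span name, and the business logic line is a placeholder):

```objectscript
set childSpan = tracer.StartSpan("My.Operation")  // "My.Operation" is a hypothetical span name
set childScope = tracer.SetActiveSpan(childSpan)
Try {
    // ... run the business logic you want to measure here ...
    do childSpan.SetStatus("Ok")
} Catch Ex {
    do childSpan.SetStatus("Error")
}
do childSpan.End()  // close the span so its duration is recorded
kill childSpan
```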

 
Configure IRIS and Jaeger Containers

1. Create the OTel collector container (in docker-compose.yml):
 

# --- 2. OpenTelemetry Collector ---
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otel-collector-config.yml"]
    volumes:
      - ./otel-collector-config.yml:/etc/otel-collector-config.yml
    ports:
      - "4317:4317" # OTLP gRPC 
      - "4318:4318" # OTLP HTTP
      - "9464:9464" # Metrics
    depends_on:
      - iris
      - jaeger 

2. Adjust the OTel collector configuration file:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "0.0.0.0:4317"
      http:
        endpoint: "0.0.0.0:4318"
exporters:
  otlp:
    endpoint: jaeger:4317 # The 'jaeger' service name from docker-compose, targeting its OTLP gRPC receiver
    tls:
      insecure: true
  prometheus:
    endpoint: "0.0.0.0:9464"
  debug: {}
processors:
  batch: # Processor that groups traces into batches
    send_batch_size: 100
    timeout: 10s
connectors:
  spanmetrics: # The SpanMetrics connector
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch] # Traces are processed to generate metrics
      exporters: [otlp, spanmetrics]
    metrics:
      receivers: [otlp, spanmetrics]
      exporters: [prometheus]
    logs:
      receivers: [otlp]
      exporters: [debug]

3. The OTel collector will receive monitoring data at the following address:

http:
        endpoint: "0.0.0.0:4318"

4. The OTel collector will forward traces to Jaeger using the exporter configured below:

exporters:
  otlp:
    endpoint: jaeger:4317 # The 'jaeger' service name from docker-compose, targeting its OTLP gRPC receiver
    tls:
      insecure: true

5. The OTel service will create a pipeline to receive and send monitoring data:
 

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp, spanmetrics]

6. In docker-compose.yml, the sample will build an IRIS container, setting the OTEL_EXPORTER_OTLP_ENDPOINT with the OTel collector's address:

iris:
    build:
      context: .
      dockerfile: Dockerfile
    restart: always
    ports:
      - 51773:1972
      - 52773:52773
      - 53773
    volumes:
      - ./:/home/irisowner/dev
    environment:
      - ISC_DATA_DIRECTORY=/home/irisowner/dev/durable
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318

 
7. Finally, a Jaeger container is created to receive data from the OTel collector and provide a UI for users to monitor and trace the IRIS OTel data:

jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686" # Jaeger UI
      - "14269:14269" # Jaeger Metrics
      - "14250:14250" # Jaeger Collector gRPC 
    environment:
      - COLLECTOR_OTLP_ENABLED=true

 
There are additional port options for Jaeger that allow you to work with multiple types of collectors. Below you can see the breakdown of the possible exposed ports:

  • 6831/udp: Accepts jaeger.thrift spans (Thrift compact)
  • 6832/udp: Accepts jaeger.thrift spans (Thrift binary)
  • 5778: Jaeger configuration
  • 16686: Jaeger UI
  • 4317: OpenTelemetry Protocol (OTLP) gRPC receiver
  • 4318: OpenTelemetry Protocol (OTLP) HTTP receiver
  • 14250: Accepts model.proto spans over gRPC
  • 14268: Accepts jaeger.thrift spans directly over HTTP
  • 14269: Jaeger health check
  • 9411: Zipkin compatibility

Running the Sample

1. Copy the source code of the sample: 

$ git clone https://github.com/yurimarx/iris-telemetry-sample.git

2. Open the terminal in this directory and run the code below:

$ docker-compose up -d --build

3. Create some fake testing data. To do that, open the IRIS terminal or the web terminal at http://localhost:52773/terminal/ and call the following:

USER>do ##class(dc.Sample.Person).AddTestData(10)

4. Open http://localhost:52773/swagger-ui/index.html and execute the endpoint /persons/all.
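
If you prefer the terminal over Swagger, you can also trigger trace emission by invoking the sample method directly (a sketch; since the method is a REST handler, it writes its JSON response to the current device):

```objectscript
// Calling the traced endpoint method directly from the USER namespace terminal;
// each call creates the Get.All.Persons root span and its child spans.
do ##class(dc.Sample.PersonREST).GetAllPersons()
```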

 

Analyzing and Tracing the IRIS Code on Jaeger

1. Go to Jaeger at http://localhost:16686/search.

2. Select Get.All.Persons in the "Service" field, and click "Find Traces."


3. Click on the found trace to see its details.


4. Observe the timeline of the selected trace.


5. On the top right, select "Trace Graph":


6. Analyze the dependencies:


7. This dependency analysis is crucial for identifying problems in remote systems/services within your environment, especially if different distributed services use the same service name.
8. Now, select the "Flamegraph" view:


9. Observe all components of the monitored transaction flow in a graphical table:


10. With all these Jaeger resources, you can effectively resolve performance problems and identify the source of errors.

 

Learn More

To delve deeper into distributed tracing, consider the following external resources:

  • Mastering Distributed Tracing (2019) by Yuri Shkuro: A book by Jaeger's creator explaining the history and architectural choices behind Jaeger, with in-depth coverage of Jaeger's design and operation, as well as distributed tracing in general.
  • Take Jaeger for a HotROD ride: A step-by-step tutorial demonstrating how to use Jaeger to solve application performance problems.
  • Introducing Jaeger: An (old) webinar introducing Jaeger and its capabilities.
  • Detailed tutorial about Jaeger: https://betterstack.com/community/guides/observability/jaeger-guide/ 
  • Evolving Distributed Tracing at Uber.
  • Emit Telemetry Data to an OpenTelemetry-Compatible Monitoring Tool by InterSystems documentation.
  • Modern Observability with InterSystems IRIS & OpenTelemetry: A video demonstrating how to work with OpenTelemetry and IRIS: https://www.youtube.com/watch?v=NxA4nBe31nA
  • OpenTelemetry on GitHub: A collection of APIs, SDKs, and tools for instrumenting, generating, collecting, and exporting telemetry data (metrics, logs, and traces) to help analyze software performance and behavior: https://github.com/open-telemetry.