
SQL statistics with Grafana, InfluxDB, and Telegraf

Introduction

Database performance has become a critical success factor in modern application environments. Therefore, identifying and optimizing the most resource-intensive SQL queries is essential for guaranteeing a smooth user experience and maintaining application stability.

This article will explore a quick approach to analyzing SQL query execution statistics on an InterSystems IRIS instance to identify areas for optimization within a macro-application.

Rather than focusing on real-time monitoring, we will set up a system that collects and analyzes statistics pre-calculated by IRIS once an hour.  This approach, while not enabling instantaneous monitoring, offers an excellent compromise between the wealth of data available and the simplicity of implementation. 

We will use Grafana for data visualization and analysis, InfluxDB for time series storage, and Telegraf for metrics collection. These tools, recognized for their power and flexibility, will give us a clear, actionable view of query activity.

More specifically, we will detail the configuration of Telegraf to retrieve statistics. We will also set up the integration with InfluxDB for data storage and analysis, and create customized dashboards in Grafana. This will help us quickly identify queries requiring special attention.

To facilitate the orchestration and deployment of these various components, we will employ Docker.

logos.png

Prerequisites

Before you start, make sure you have the following:

  • Git: Git is required to clone the project repository containing configuration files and scripts.
  • Docker or Docker Desktop: Docker can be used to containerize InfluxDB, Telegraf, and Grafana applications, making them easier to deploy and manage.
  • InterSystems IRIS instance: At least version 2022.1, ideally 2023.1 or higher, with the sql-stats-api package installed. This package is essential for exposing IRIS SQL statistics and enabling Telegraf to collect them. [Link to OpenExchange]

Although our docker-compose includes an IRIS instance, it will not contain any SQL statistics data since it is freshly built and started. Therefore, it is not a convenient choice for testing the system. That is why we strongly recommend having another “active” IRIS instance (one with a history of SQL queries) so that you can visualize real data and test the analysis tool.

About statistics

IRIS collects SQL query execution statistics at hourly and daily granularity. Hourly statistics identify performance variations throughout the day, whereas daily statistics provide an overview of database activity.

Below you can see the data we will collect for each SQL query:

  • Number of executions: how many times the query has been executed.
  • Total execution time: the cumulative time spent executing the query.
  • Variance of execution times: useful for spotting performance variations and one-off problems.
  • Total number of rows returned (RowCount): available in IRIS 2023.1 and higher, this metric indicates the total number of rows returned by the query and can help you identify resource-hungry queries.
  • Total number of commands executed: also available in IRIS 2023.1 and up, this metric enables a more detailed analysis of database activity and pinpoints queries that could be optimized by reducing the number of operations.

This information can be accessed via the following tables:

  • INFORMATION_SCHEMA.STATEMENT_DAILY_STATS
  • INFORMATION_SCHEMA.STATEMENT_HOURLY_STATS

These tables have been available since IRIS 2022.1. Below are example SQL queries for retrieving these statistics:

SELECT ds.*
FROM INFORMATION_SCHEMA.STATEMENT_DAILY_STATS ds
INNER JOIN INFORMATION_SCHEMA.STATEMENTS st ON ds.Statement = st.Hash

SELECT DATEADD('hh', "Hour", $PIECE(hs."Day", '||', 2)) AS DateTime, hs.*
FROM INFORMATION_SCHEMA.STATEMENT_HOURLY_STATS hs
INNER JOIN INFORMATION_SCHEMA.STATEMENTS st ON $PIECE(hs."Day", '||', 1) = st.Hash

For versions older than IRIS 2022.1, I highly recommend the article by David Loveluck, which explains how to retrieve similar statistics.

Architecture

The project is based on the interaction of four key components: IRIS, Grafana, InfluxDB, and Telegraf. The diagram below illustrates the overall architecture of the system and the flow of data between various components:

archi.png

  • InterSystems IRIS: the instance from which we will retrieve statistics.
  • Package sql-stats-api: This ObjectScript package exposes IRIS statistical data via a REST API. It offers two output formats: JSON for general use and Line Protocol, a format optimized for fast ingestion of time series data into InfluxDB (a sample call is shown after this list).
  • Telegraf: It is a metrics collection agent that provides the link between IRIS and InfluxDB. In this project, we will use two instances of Telegraf:
    • One agent periodically queries the IRIS REST API to retrieve SQL statistics in real time.
    • Another agent operates in “directory scan” mode. It monitors a directory for files containing exported statistics and forwards their contents to InfluxDB, which makes it possible to load historical data that is not retrieved via the REST API.
  • InfluxDB:  This time series database stores and manages the SQL statistics collected by Telegraf since its architecture is optimized for this data type. InfluxDB also offers native integration with Grafana, facilitating data visualization and analysis. We chose InfluxDB over Prometheus because the latter is more geared towards real-time monitoring and is not well suited for storing aggregated data, e.g., sums or averages per hour or day, which are essential for our analysis.
  • Grafana: It is a visualization tool that lets you create customized, interactive dashboards to analyze SQL performance. It retrieves data from InfluxDB and offers a variety of graphs and widgets to present the statistics clearly.
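As a quick way to see what the sql-stats-api package exposes, you can call its endpoints directly with curl. The URLs below are the same ones used later in the Telegraf configuration (the instance is reachable as iris inside the Docker network; replace localhost:52773 with the host and port of your own IRIS web server), the Accept header mirrors the Telegraf configuration and is assumed to select the Line Protocol output, and the credentials are the defaults of the bundled IRIS instance:

# Daily statistics; "Accept: text/plain" is assumed to return Line Protocol
curl -u "_system:SYS" -H "Accept: text/plain" "http://localhost:52773/csp/sqlstats/api/daily"

# Hourly statistics in the same format
curl -u "_system:SYS" -H "Accept: text/plain" "http://localhost:52773/csp/sqlstats/api/hourly"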

Installation

Start by cloning the repository:

git clone https://github.com/lscalese/iris-sql-dashboard.git
cd iris-sql-dashboard

Setting up the environment

This project uses Docker to orchestrate Grafana, InfluxDB, Telegraf, and IRIS. For security reasons, sensitive information such as API keys and passwords is stored in a .env file.

Create the .env file using the example provided below:

cp .env.example .env

Edit the .env file to configure the variables:

Variable configuration

  • TZ: Time zone. Modify this variable to match your time zone so that the collected data is time-stamped correctly.
  • DOCKER_INFLUXDB_INIT_PASSWORD: The administrator password for InfluxDB.
  • IRIS_USER: The IRIS user of the IRIS Docker instance (_system by default).
  • IRIS_PASSWORD: The password of the IRIS Docker instance (SYS by default).

API keys allow the following connections (a complete example .env is shown after this list):

  • GRAFANA_INFLUX_API_KEY: Grafana <-> InfluxDB.
  • TELEGRAF_INFLUX_API_KEY: Telegraf <-> InfluxDB.
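Once filled in, the .env file could look roughly like this. All values below are placeholders, and the two API keys will be filled in automatically in the next step:

TZ=Europe/Brussels
DOCKER_INFLUXDB_INIT_PASSWORD=ChangeMeSuperSecret
IRIS_USER=_system
IRIS_PASSWORD=SYS
GRAFANA_INFLUX_API_KEY=
TELEGRAF_INFLUX_API_KEY=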

API Key Generation

For security reasons, InfluxDB requires API keys for authentication and authorization. These keys are used to identify and authorize the various components (Telegraf, Grafana) that access InfluxDB.

The init-influxdb.sh script, included in the repository, generates these keys. It runs automatically the first time the influxdb2 container is started:

docker compose up -d influxdb2

After a few seconds, the .env file will be updated with your generated API keys.

Note: this step should only be performed the first time you start the container.

Check whether you have access to the InfluxDB administration interface via the URL http://localhost:8086/

Log in with the “admin” username and the password specified in the “DOCKER_INFLUXDB_INIT_PASSWORD” environment variable in the “.env” file. When browsing “Load Data >> Buckets”, you should find a pre-configured “IRIS_SQL_STATS” bucket.

influxdb-2.png

Under “Load Data >> API Tokens”, you should find our two API keys: “Grafana_IRIS_SQL_STATS” and “Telegraf_IRIS_SQL_STATS”:

influxdb-3.png

The environment is ready now, and we can move on to the next step!

Start

Now that the environment has been configured and the API keys have been generated, you can launch the full container set. To do that, run the following command in the project root directory:

docker compose up -d

This command will start all the services defined in the docker-compose.yml file in the background: InfluxDB, Telegraf, Grafana, and the IRIS instance.
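To confirm that everything came up correctly, you can list the project's containers; the exact service names depend on the docker-compose.yml shipped with the repository:

# Show the state of all services defined in docker-compose.yml
docker compose ps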

Grafana dashboards

Grafana is now available at http://localhost:3000.

Connect to Grafana

Open your web browser and go to http://localhost:3000. The default username and password are admin/admin. However, you will be prompted to change the password the first time you log in.

grafana-login.png

Checking the InfluxDB data source

The InfluxDB data source is pre-configured in Grafana. All you need to do is check that it is working properly.

Go to “Connections > Data sources”.

grafana-ds.png

You should see a data source named “influxdb”. Click it to open it.

Then click “Save & Test”. The message “Datasource is working. 1 bucket found” should appear on the screen now.

grafana-datasources-influxdb.png

Exploring dashboards

At this point, you have verified that communication between Grafana and InfluxDB is established, meaning you can explore the predefined dashboards.

Proceed to “Dashboards”.

grafana-dashboard-list.png

You will find two predefined dashboards:

  • InfluxDB - SQL Stats: This dashboard displays general SQL query execution statistics, e.g., number of executions, total execution time, and execution time variance.
  • InfluxDB - SQL Stats Details: This dashboard provides more detailed information on each SQL query, e.g., the total number of returned rows or executed commands.

Why the dashboards are empty

If you open the dashboards, you will notice that they are empty.  This is because our Telegraf agent is currently connected to the IRIS instance provided in the Docker repository, which contains no statistical data in its tables. SQL statistics are only collected if the IRIS instance is active and retains a history of SQL queries.

In the next section, we will explore how to inject data into the IRIS instance to view statistics in Grafana.

 

Telegraf

The monitoring system operates two Telegraf agents with specific roles:

  • telegraf-iris.conf: This agent collects real-time data from an active IRIS instance. It queries the IRIS REST API to retrieve SQL statistics and sends them to InfluxDB.
  • telegraf-directory-scan.conf: This agent integrates historical data stored in files. It monitors the ./telegraf/in/ directory, reads files containing SQL statistics, and sends them to InfluxDB (a configuration sketch is shown after this list).
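Both configuration files are included in the repository, so you normally do not have to write them yourself. As an illustration, a directory-scan input based on Telegraf's directory_monitor plugin could look roughly like the sketch below; the container paths, organization name, and output block are placeholders, and the actual telegraf-directory-scan.conf in the repository may differ:

# Illustrative sketch only; the repository's telegraf-directory-scan.conf may differ
[[inputs.directory_monitor]]
  ## Directory watched for new files (mounted from ./telegraf/in/ on the host)
  directory = "/var/telegraf/in"
  ## Successfully ingested files are moved here
  finished_directory = "/var/telegraf/done"
  ## The exported files are already in InfluxDB Line Protocol
  data_format = "influx"

[[outputs.influxdb_v2]]
  urls = ["http://influxdb2:8086"]
  token = "${TELEGRAF_INFLUX_API_KEY}"
  ## Placeholder: use the organization configured in InfluxDB
  organization = "my-org"
  bucket = "IRIS_SQL_STATS"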

To collect real-time data, you need to connect Telegraf to an active IRIS instance with the sql-stats-api package installed. This package exposes SQL statistics via a REST API, enabling Telegraf to access them.

Configuring telegraf-iris.conf

To connect Telegraf to your IRIS instance, you must modify the ./telegraf/config/telegraf-iris.conf file. Below you can find an example of the configuration:

[[inputs.http]]
  ## One or more URLs from which to read formatted metrics
  urls = [
    "http://iris:52773/csp/sqlstats/api/daily",
    "http://iris:52773/csp/sqlstats/api/hourly"
  ]
  ## HTTP method
  method = "GET"

  ## Optional HTTP headers
  headers = {"Accept" = "text/plain"}
  ## Optional HTTP Basic Auth Credentials
  username = "${IRIS_USER}"
  password = "${IRIS_PASSWORD}"
  data_format = "influx"

Ensure that ${IRIS_USER} and ${IRIS_PASSWORD} are correctly defined in your .env file.
Note: You can copy the file and edit the parameters to connect Telegraf to several IRIS instances.
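If you would rather keep a single file, another option is to add a second [[inputs.http]] block pointing at the other instance and tag it so the instances can be distinguished in Grafana. A rough sketch, with a placeholder hostname and tag value (each instance may also need its own credential variables):

[[inputs.http]]
  urls = [
    "http://my-other-iris:52773/csp/sqlstats/api/daily",
    "http://my-other-iris:52773/csp/sqlstats/api/hourly"
  ]
  method = "GET"
  headers = {"Accept" = "text/plain"}
  username = "${IRIS_USER}"
  password = "${IRIS_PASSWORD}"
  data_format = "influx"
  ## Tag every measurement from this input so dashboards can filter by instance
  [inputs.http.tags]
    iris_instance = "my-other-iris"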

Restart Telegraf:

After modifying the configuration file, you need to restart the Telegraf container for the changes to take effect:

docker compose up -d telegraf --force-recreate

Retrieving historical data

To retrieve historical SQL statistics, use the ObjectScript CreateInfluxFile method on your IRIS instance:

; Adapt the path to your needs
Set sc = ##class(dc.sqlstats.services.SQLStats).CreateInfluxFile("/home/irisowner/dev/influxdb-lines.txt",,1)

This method writes the SQL statistics history to text files, with a maximum of 10,000 lines per file. You can then place those files in the ./telegraf/in/ directory so that the directory-scan agent processes them and injects them into InfluxDB.
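For reference, the generated files are plain Line Protocol: one measurement per line, consisting of a measurement name, optional tags, fields, and a timestamp in nanoseconds. The names and values below are purely illustrative and do not necessarily match what sql-stats-api actually emits:

# measurement,tag_set field_set timestamp
sql_stats,statement=3CA19EFA,namespace=USER ExecCount=42i,TotalTime=1.87 1744070400000000000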

Checking data injection

You can verify that the data has been correctly injected into InfluxDB using the web interface. Go to the “Data Explorer” and browse the IRIS_SQL_STATS bucket:

influxdb-explorer.png

Data visualization in Grafana

Once the data has been injected, you can view it in your provisioned Grafana dashboards.

grafana-dashboard-daily-stats.png

grafana-dashboard-daily-details.png

We have come to the end of this article. I hope it was useful for you and taught you how to easily set up a system for monitoring and analyzing SQL statistics on your IRIS instances.

As you might have noticed, this article focused on the practical aspects of configuring and employing the various tools. We have not gone into the details of the inner workings of InfluxDB, the Line Protocol format, or the Flux query language, nor did we examine the multitude of plugins available for Telegraf.

Those topics, however fascinating they are, would have required a much longer article. I strongly encourage you to consult the official documentation (Get started with InfluxDB and the Telegraf Plugin Directory) to deepen your knowledge and discover all the possibilities offered by these tools.

Do not hesitate to share your experiences in the comments.

Thanks for reading, and see you soon!
