Common Table Expressions (CTEs) in InterSystems IRIS SQL
Abstract
Common Table Expressions (CTEs) provide a structured framework for defining reusable intermediate result sets within SQL statements. InterSystems IRIS implements CTEs via the WITH clause, enabling clearer query composition and modular analytical processing while remaining fully integrated with the IRIS cost-based optimizer.
This article explores the semantics of CTEs in InterSystems IRIS, explains their interaction with query optimization, discusses appropriate deployment scenarios, and presents executable examples illustrating practical patterns for production environments.
-
Introduction
Complex SQL workloads often require multiple logical processing stages (e.g., filtering, aggregation, enrichment, and projection). Without abstraction mechanisms, these operations often result in deeply nested queries that are difficult to interpret and maintain.
Common Table Expressions address this issue by allowing developers to define named query components evaluated within a single SQL statement.
In InterSystems IRIS, CTEs are implemented using the SQL WITH clause and operate as logical query expressions optimized alongside the surrounding statement [1].
2. Common Table Expressions in InterSystems IRIS
2.1 Definition
A Common Table Expression is a named subquery declared before a SELECT statement:
WITH PersonSubset AS (
SELECT ID, Name, Age
FROM Sample.Person
)
SELECT Name
FROM PersonSubset
To execute the example above, run the following statements:
CREATE TABLE Sample.Person (
ID INTEGER,
Name VARCHAR(100),
Age INTEGER
)
CREATE INDEX idx_person_name
ON Sample.Person (Name)
CREATE INDEX idx_person_age
ON Sample.Person (Age)
INSERT INTO Sample.Person (ID, Name, Age)
SELECT 1, 'Alice', 30 UNION
SELECT 2, 'Bob', 45 UNION
SELECT 3, 'Carol', 28 UNION
SELECT 4, 'David', 52
According to the InterSystems IRIS SQL Reference, the WITH clause allows a query expression to be referenced as if it were a subquery within the same statement [1].
Key characteristics:
- Scoped to a single SQL statement.
- Capable of defining multiple CTEs.
- Enhances logical organization.
- Fully participates in query optimization.
The WITH clause allows us to have one or more CTEs:
WITH CTE1 AS (
SELECT ...
)
CTE2 AS (
SELECT ...
)
...
SELECT Name
FROM ...
The term "query optimization" implies that the CTE definition is analyzed together with the enclosing query as a unified execution plan, allowing the IRIS optimizer to rewrite, merge, and reorder operations across CTE boundaries.
2.2 Execution Semantics
A CTE in IRIS represents a logical abstraction rather than a physically stored intermediate dataset.
The optimizer may do the following:
- Inline the CTE.
- Merge predicates.
- Reorder joins.
- Eliminate redundant operations.
Example:
WITH
-- Base filter
OlderPeople AS (
SELECT ID, Name, Age
FROM Sample.Person
WHERE Age >= 18
),
-- Derived computation
AgeGroups AS (
SELECT
ID,
Name,
Age,
CASE
WHEN Age >= 60 THEN 'Senior'
WHEN Age >= 40 THEN 'Middle'
ELSE 'Adult'
END AS AgeCategory
FROM OlderPeople
),
-- Additional filter
MiddleAged AS (
SELECT *
FROM AgeGroups
WHERE AgeCategory = 'Middle'
),
-- Aggregation
Stats AS (
SELECT
COUNT(*) AS TotalPeople,
AVG(Age) AS AvgAge
FROM MiddleAged
)
SELECT *
FROM Stats
Execution plan:
Statement Text
WITH OlderPeople AS ( SELECT ID , Name , Age FROM Sample . Person WHERE Age >= ? ) , AgeGroups AS ( SELECT ID , Name , Age , CASE WHEN Age >= ? THEN ? WHEN Age >= ? THEN ? ELSE ? END AS AgeCategory FROM OlderPeople ) , MiddleAged AS ( SELECT * FROM AgeGroups WHERE AgeCategory = ? ) , Stats AS ( SELECT COUNT ( * ) AS TotalPeople , AVG ( Age ) AS AvgAge FROM MiddleAged ) SELECT * FROM Stats /*#OPTIONS {"DynamicSQLTypeList":"10,10,1,10,1,1,1"} */
Query Plan
Relative Cost = 21000
• Call module E.
• Output the row.
Module: E
• Read index map Sample.Person.idx_person_age, looping on Age (with a range condition) and ID1.
• For each row:
- Accumulate the count(rows).
- Accumulate the count(Age).
- Accumulate the sum(Age).
Execution is equivalent to a single query plan.
SELECT
COUNT(*),
AVG(Age)
FROM Sample.Person
WHERE Age BETWEEN 40 AND 59
Execution plan:
Statement Text
SELECT COUNT ( * ) , AVG ( Age ) FROM Sample . Person WHERE Age BETWEEN ? AND ? /*#OPTIONS {"DynamicSQLTypeList":"10,10"} */
Query Plan
Relative Cost = 271.99
• Call module B.
• Output the row.
Module: B
• Read index map Sample.Person.idx_person_age, looping on Age (with a range condition) and ID1.
• For each row:
- Accumulate the count(rows).
- Accumulate the count(Age).
- Accumulate the sum(Age).
Note that both statements produce the same execution plan, meaning they are expected to have equivalent runtime performance. While the CTE version introduces several intermediate logical steps, the query optimizer simplifies them into a single optimized execution strategy.
Therefore, CTE usage primarily improves clarity and maintainability, whereas performance depends on overall query design.
2.3 Layered CTE Definitions
Multiple CTEs can be chained together to form analytical pipelines.
WITH Base AS (
SELECT ID, Age
FROM Sample.Person
),
Adults AS (
SELECT *
FROM Base
WHERE Age >= 18
)
SELECT *
FROM Adults
Each layer represents one logical transformation stage.
This pattern is particularly useful for reporting and analytical SQL.
2.4 Relationship to Other Query Abstractions
CTEs coexist with other IRIS SQL constructs:
| Mechanism | Scope | Materialization | Typical Use |
|---|---|---|---|
| CTE | Local | Logical | Query structuring |
| Subquery | Local | Logical | Simple filtering |
| View | Persistent | Logical | Reusable abstraction |
| Temporary table | Session | Physical | Reuse and performance control |
Even though CTEs and subqueries are limited in scope, CTEs provide a reusable named result set within a single statement, while subqueries exist only at their point of use and must be rewritten if referenced again.
The right appropriate abstraction choice depends on reuse requirements and data volume.
3. Usage Patterns
The following statements are necessary for the upcoming examples. Run them if you wish to see how they work.
CREATE TABLE DemoCTE.DemoOrders (
ID INTEGER,
Customer VARCHAR(30),
Amount NUMERIC(10,2),
OrderYear INTEGER
)
CREATE INDEX DemoOrdersCustomerIdx ON DemoCTE.DemoOrders(Customer)
CREATE INDEX DemoOrdersAmountIdx ON DemoCTE.DemoOrders(Amount)
CREATE INDEX DemoOrdersYearIdx ON DemoCTE.DemoOrders(OrderYear)
INSERT INTO DemoCTE.DemoOrders
SELECT 1,'Alice',1200,2025 UNION
SELECT 2,'Bob',300,2025 UNION
SELECT 3,'Alice',800,2024 UNION
SELECT 4,'Carol',1500,2025
CREATE TABLE DemoCTE.Sales (
ID INTEGER,
Customer VARCHAR(30),
Amount NUMERIC(10,2),
Year INTEGER
)
CREATE INDEX SalesCustomerIdx ON DemoCTE.Sales(Customer)
CREATE INDEX SalesYearIdx ON DemoCTE.Sales(Year)
CREATE INDEX SalesAmountIdx ON DemoCTE.Sales(Amount)
INSERT INTO DemoCTE.Sales
SELECT 1,'Alice',100,2025 UNION
SELECT 2,'Alice',300,2025 UNION
SELECT 3,'Bob',200,2025 UNION
SELECT 4,'Bob',50,2024 UNION
SELECT 5,'Carol',900,2025
3.1 Query Decomposition
CTEs simplify complex SQL by isolating logical concerns.
WITH
Orders2025 AS (
SELECT Customer, SUM(Amount) AS TotalOrders2025
FROM DemoCTE.DemoOrders
WHERE OrderYear = 2025
GROUP BY Customer
),
HistoricalSales AS (
SELECT Customer, AVG(Amount) AS AvgSales
FROM DemoCTE.Sales
GROUP BY Customer
)
SELECT
o.Customer,
o.TotalOrders2025,
s.AvgSales
FROM Orders2025 o
JOIN HistoricalSales s
ON o.Customer = s.Customer
WHERE o.TotalOrders2025 > s.AvgSales
This approach reduces nesting and improves readability.
To check it out, compare it to its corresponding non-CTE version:
SELECT
o.Customer,
o.TotalOrders2025,
s.AvgSales
FROM (
SELECT Customer, SUM(Amount) AS TotalOrders2025
FROM DemoCTE.DemoOrders
WHERE OrderYear = 2025
GROUP BY Customer
) o
JOIN (
SELECT Customer, AVG(Amount) AS AvgSales
FROM DemoCTE.Sales
GROUP BY Customer
) s
ON o.Customer = s.Customer
WHERE o.TotalOrders2025 > s.AvgSales
It is the same logic. However, it is significantly harder to understand compared to its CTE version.
3.2 Multi-Stage Data Transformation
CTEs naturally express transformation pipelines.
WITH Filtered AS (
SELECT *
FROM DemoCTE.Sales
WHERE Amount > 100
),
Aggregated AS (
SELECT Customer, SUM(Amount) Total
FROM Filtered
GROUP BY Customer
)
SELECT *
FROM Aggregated
ORDER BY Total DESC
All operations remain inside the SQL optimizer.
In general, queries that require multiple logical steps (e.g., filtering, aggregation, and comparison) are good candidates for CTEs since without them, the query becomes deeply nested and more difficult to comprehend.
3.3 Reusing Intermediate Results
A CTE can be referenced multiple times within the same query.
WITH HighSales AS (
SELECT *
FROM DemoCTE.Sales
WHERE Amount > 500
)
SELECT COUNT(*)
FROM HighSales
UNION ALL
SELECT SUM(Amount)
FROM HighSales
Since a CTE is more logical than physical, repeated references may result in re-evaluation depending on optimizer decisions.
4. Performance Considerations
4.1 Optimization Behavior
InterSystems IRIS treats CTEs as part of a unified execution plan rather than independent steps.
Consequences include the following:
- Execution order is optimizer-driven.
- Predicate pushdown may occur.
- CTE boundaries do not mandate materialization.
Predicate pushdown occurs when filtering conditions defined in the outer query are applied earlier amid execution, allowing the optimizer to reduce the number of rows processed before subsequent operations.
For instance, consider the code below:
WITH SalesData AS (
SELECT *
FROM DemoCTE.Sales
)
SELECT *
FROM SalesData
WHERE Customer = 'Alice'
Execution plan:
Statement Text
WITH SalesData AS ( SELECT * FROM DemoCTE . Sales ) SELECT * FROM SalesData WHERE Customer = ? /*#OPTIONS {"DynamicSQLTypeList":"1"} */
Query Plan
Relative Cost = 936
• Read index map DemoCTE.Sales.SalesCustomerIdx, using the given %SQLUPPER(Customer), and looping on ID1.
• For each row:
- Read the master map DemoCTE.Sales.IDKEY, using the given IDKEY value.
- Output the row.
The optimizer recognizes that the filter can be applied earlier and internally rewrites the query:
SELECT *
FROM DemoCTE.Sales
WHERE Customer = 'Alice'
Execution plan:
Statement Text
SELECT * FROM DemoCTE . Sales WHERE Customer = ? /*#OPTIONS {"DynamicSQLTypeList":"1"} */
Query Plan
Relative Cost = 936
• Read index map DemoCTE.Sales.SalesCustomerIdx, using the given %SQLUPPER(Customer), and looping on ID1.
• For each row:
- Read the master map DemoCTE.Sales.IDKEY, using the given IDKEY value.
- Output the row.
The predicate was pushed down into the base table access. You can check this out by noting that both execution plans are identical.
Performance evaluation should therefore rely on execution plan analysis rather than query structure alone.
4.2 Indexing and Cardinality
Performance depends primarily on the underlying tables and data access paths. Therefore, the same best practices that apply to general SQL query optimization also work in this context:
- Index join and filtering columns,
- Avoid unnecessary row expansion,
- Maintain accurate statistics.
By avoiding unnecessary row expansion, you can minimize excessive growth of intermediate result sets (“row expansion”) when designing queries. Although the IRIS optimizer may apply predicate pushdown to filter rows in advance, this optimization is not always possible (e.g., when predicates depend on joins, aggregations, or computed results). Involving selective filters as early as logically possible and indexing join columns appropriately helps prevent large intermediate datasets and improves cardinality estimation.
CTEs themselves neither improve nor degrade performance intrinsically.
5. When CTEs Are Not the Best Choice
While CTEs are effective for logical structuring, some scenarios benefit more from alternative approaches.
Large or Reused Intermediate Results
When intermediate datasets are accessed multiple times, explicit materialization may be preferable:
CREATE GLOBAL TEMPORARY TABLE TopCustomers AS
SELECT Customer, SUM(Amount) Total
FROM DemoCTE.Sales
GROUP BY Customer
Temporary tables avoid repeated computation and provide a predictable execution cost.
Complex Iterative or Hierarchical Processing
Certain workloads require iterative or hierarchical traversal logic. However, since InterSystems IRIS CTE does not currently support recursion, such scenarios could still be implemented using the following:
- self-joins,
- temporary staging tables,
- procedural orchestration with ObjectScript.
It reflects a design characteristic of the IRIS SQL engine and guides developers toward hybrid SQL-plus-ObjectScript solutions when iterative behavior is required.
Extremely Deep CTE Layering
Excessive chaining of CTEs can reduce readability and complicate plan analysis. In such cases, views or staged queries may provide a clearer structure.
6. Complete Executable Examples
Example 1 — Basic CTE
CREATE TABLE IF NOT EXISTS DemoCTE.DemoOrders(
ID INT,
Customer VARCHAR(30),
Amount NUMERIC(10,2),
OrderYear INT
)
INSERT INTO DemoCTE.DemoOrders
SELECT 1,'Alice',1200,2025 UNION
SELECT 2,'Bob',300,2025 UNION
SELECT 3,'Alice',800,2024 UNION
SELECT 4,'Carol',1500,2025
WITH HighValue AS (
SELECT *
FROM DemoCTE.DemoOrders
WHERE Amount > 1000
)
SELECT *
FROM HighValue
Example 2 — Multi-Stage Transformation
WITH Orders2025 AS (
SELECT *
FROM DemoCTE.DemoOrders
WHERE OrderYear = 2025
),
CustomerTotals AS (
SELECT Customer,
SUM(Amount) TotalAmount
FROM Orders2025
GROUP BY Customer
)
SELECT *
FROM CustomerTotals
ORDER BY TotalAmount DESC
Example 3 — Reusing a CTE
WITH HighSales AS (
SELECT *
FROM DemoCTE.DemoOrders
WHERE Amount > 500
)
SELECT COUNT(*) AS NumOrders
FROM HighSales
UNION ALL
SELECT SUM(Amount)
FROM HighSales
Example 4 — Using a Temporary Table Instead
Note: to be able to run this example and observe data, you have to use an SQL client such as DBeaver. If you launch it through the IRIS Portal, the process will change on each execution, and you will not be able to see the temporary data.
CREATE GLOBAL TEMPORARY TABLE DemoCTE.HighSales AS
SELECT *
FROM DemoCTE.DemoOrders
WHERE Amount > 500
SELECT COUNT(*)
FROM DemoCTE.HighSales
SELECT Customer, SUM(Amount)
FROM DemoCTE.HighSales
GROUP BY Customer
Example 5 — Dynamic SQL Generating a CTE (ObjectScript)
Set sql = ""_
"WITH CustomerTotals AS (" _
" SELECT Customer, SUM(Amount) Total" _
" FROM DemoCTE.DemoOrders" _
" GROUP BY Customer" _
") " _
"SELECT * FROM CustomerTotals"
Set stmt = ##class(%SQL.Statement).%New()
$$$ThrowOnError(stmt.%Prepare(sql))
Set result = stmt.%Execute()
If (result.%SQLCODE '= 0) && (result.%SQLCODE '= 100) {
Throw ##class(%Exception.SQL).CreateFromSQLCODE(result.%SQLCODE, result.%Message)
}
While result.%Next() {
Write result.Customer, " ", result.Total, !
}
7. Conclusions
Common Table Expressions in InterSystems IRIS provide an effective mechanism for organizing complex SQL queries and building readable analytical pipelines.
Best practices include the following:
- Using CTEs for logical clarity,
- Validating performance through execution plans,
- Explicitly materializing data when reusage or iteration is required.
- Combining SQL and ObjectScript when workloads extend beyond declarative query patterns.
When applied appropriately, CTEs significantly contribute to maintainable and performant IRIS SQL solutions.
References
[1] InterSystems Corporation. WITH (SQL) — InterSystems IRIS SQL Reference https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=RSQL_with
Footnote
This article was developed with the assistance of Artificial Intelligence tools for drafting and language refinement. All technical validation and final review were performed by the author.
Comments
Muito interessante este artigo, José! Até salvei ele pois, com certeza, vou utiliza-lo! rs
Parabéns!