Written by Dmitry Maslennikov
IRIS Developer Advocate, Software developer at CaretDev, Tabcorp
Article · 10 hr ago · 4m read

Inside $LISTBUILD

Most IRIS developers use $LISTBUILD every day — often without even noticing it.

It is not just a convenient function for building lists. It is also the default internal format used to store row data, global values, and many intermediate structures inside the database engine.

Despite this, the actual binary representation of $LISTBUILD values is rarely discussed. Most developers rely on its behavior, but never look at how the data is really stored.

This article focuses strictly on the binary layout of $LISTBUILD values, based on direct inspection via zzdump.


Element Structure

A $LISTBUILD value is a sequence of elements. Each element is encoded as:

[length][type][payload]

For small elements:

  • length is 1 byte and counts the entire element, including the length byte itself
  • type is 1 byte
  • payload is variable

Example:

$lb("hello")
→ 07 01 68 65 6C 6C 6F
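As a rough illustration of the small-element layout, the following Python sketch splits one element into its three parts. The function name `parse_small_element` is hypothetical, and the sketch assumes the 1-byte length form only.

```python
def parse_small_element(data: bytes, offset: int = 0):
    """Split one small element into (length, type, payload).

    Assumes the 1-byte length form, where the length byte counts the
    whole element: itself, the type byte, and the payload.
    """
    length = data[offset]
    if length < 2:
        raise ValueError("this sketch only handles elements that carry a type byte")
    type_code = data[offset + 1]
    payload = data[offset + 2 : offset + length]
    return length, type_code, payload


# $lb("hello") as dumped above
raw = bytes.fromhex("07 01 68 65 6C 6C 6F")
print(parse_small_element(raw))  # (7, 1, b'hello')
```

The payload length is therefore always the length byte minus 2 in this form.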

Endianness

The examples below assume a little-endian platform, which is the most common deployment for IRIS.

IRIS supports both little-endian and big-endian architectures, and the byte order of stored values follows the underlying platform.

This affects:

  • integer payloads
  • decimal mantissas
  • floating-point values (08, 09)
  • UTF-16 encoding

$LISTBUILD is not strictly platform-independent at the byte level. Any external decoder must account for endianness.
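To make the endianness point concrete, here is a small Python sketch showing that the same payload bytes give different values depending on the byte order a decoder assumes. The helper `read_uint` is hypothetical; the sample bytes are the payload of 256 as dumped on a little-endian platform (04 04 00 01).

```python
import sys


def read_uint(payload: bytes, byteorder: str = "little") -> int:
    """Interpret a payload as an unsigned integer in the given byte order."""
    return int.from_bytes(payload, byteorder)


payload = bytes.fromhex("00 01")     # payload of 256 written by a little-endian system
print(read_uint(payload, "little"))  # 256
print(read_uint(payload, "big"))     # 1
print(sys.byteorder)                 # byte order of the machine running this script
```

An external decoder would need the byte order of the producing system as an explicit parameter, since it cannot be inferred from the machine doing the decoding.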


Extended Length Encoding

When an element is too long for its length to fit in a single byte, IRIS switches to an extended form:

  • 00 + 2-byte length
  • 00 00 + 4-byte length

Example:

00 01 01 01 ...

Length encoding is therefore variable-width and marker-based.
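A minimal Python sketch of reading the length header might look like the following. It is deliberately limited: the function name `read_length` is hypothetical, only the 1-byte form and the 00 + 2-byte form are handled, and the raw number is returned without interpreting which bytes it counts in the extended form, since that detail is not stated above.

```python
def read_length(data: bytes, offset: int = 0, byteorder: str = "little"):
    """Return (length_value, header_size) for the element starting at offset.

    Handles the 1-byte form and the 00 + 2-byte form only; the 4-byte
    form is left out of this sketch.
    """
    first = data[offset]
    if first != 0:
        # Small element: the first byte is the length itself.
        return first, 1
    # Marker byte 00 followed by a 2-byte length in platform byte order.
    value = int.from_bytes(data[offset + 1 : offset + 3], byteorder)
    return value, 3


print(read_length(bytes.fromhex("07 01 68 65 6C 6C 6F")))  # (7, 1)
print(read_length(bytes.fromhex("00 01 01 01")))           # (257, 3)
```

In the second call the byte after the 3-byte header (01) would be the type byte.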


String Types

Type  Meaning
01    ASCII string
02    Unicode string

Examples:

$lb("")       → 02 01
$lb("hello")  → 07 01 68 65 6C 6C 6F
$lb("привет") → 0E 02 ...

Unicode strings are stored as UTF-16.

Surrogate Pairs

Characters outside the Basic Multilingual Plane use UTF-16 surrogate pairs:

$lb("🔟")
→ 06 02 3D D8 1F DD
Code units D83D DD1F (each read as a little-endian 16-bit value) → U+1F51F

Unicode values are stored as raw UTF-16 (platform endianness), without compression.
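The two string types can be decoded along these lines in Python. This is a sketch under assumptions: `decode_string` is a hypothetical helper, type 01 is decoded as plain ASCII as described above, and type 02 uses Python's UTF-16 codecs, which combine surrogate pairs automatically.

```python
def decode_string(type_code: int, payload: bytes, byteorder: str = "little") -> str:
    """Decode the payload of a type 01 or type 02 element."""
    if type_code == 0x01:
        return payload.decode("ascii")
    if type_code == 0x02:
        # Raw UTF-16 in the byte order of the producing platform.
        codec = "utf-16-le" if byteorder == "little" else "utf-16-be"
        return payload.decode(codec)
    raise ValueError(f"not a string type: {type_code:#04x}")


print(decode_string(0x01, bytes.fromhex("68 65 6C 6C 6F")))  # hello
ten = decode_string(0x02, bytes.fromhex("3D D8 1F DD"))
print(hex(ord(ten)))                                         # 0x1f51f
```

The surrogate pair decodes to a single Python character, so `len(ten)` is 1 even though the payload holds two 16-bit code units.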


Integer Encoding

Type  Meaning
04    non-negative integer
05    negative integer

Encoding is variable-length.

Special cases:

0  → 02 04
-1 → 02 05

Examples:

1     → 03 04 01
255   → 03 04 FF
256   → 04 04 00 01

-2    → 03 05 FE
-256  → 03 05 00
-257  → 04 05 FF FE

Observations:

  • non-negative integers use unsigned binary representation
  • negative integers use two’s complement with variable width; leading FF bytes are dropped, which is why -1 has an empty payload
  • payload grows only when required
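The integer rules above can be sketched in Python as follows. `decode_integer` is a hypothetical helper; for type 05 it treats the payload as two's complement over exactly as many bytes as are present, which reproduces every example listed, including the empty payloads.

```python
def decode_integer(type_code: int, payload: bytes, byteorder: str = "little") -> int:
    """Decode the payload of a type 04 or type 05 element."""
    unsigned = int.from_bytes(payload, byteorder)  # an empty payload gives 0
    if type_code == 0x04:
        return unsigned
    if type_code == 0x05:
        # Two's complement over len(payload) bytes.
        # With an empty payload this is 0 - 1 = -1.
        return unsigned - (1 << (8 * len(payload)))
    raise ValueError(f"not an integer type: {type_code:#04x}")


print(decode_integer(0x04, bytes.fromhex("00 01")))  # 256
print(decode_integer(0x05, bytes.fromhex("00")))     # -256
print(decode_integer(0x05, bytes.fromhex("FF FE")))  # -257
print(decode_integer(0x05, b""))                     # -1
```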

Decimal Numbers

Type  Meaning
06    positive decimal
07    negative decimal

Structure:

[length][type][scale][mantissa...]
  • scale: 1 signed byte holding the decimal exponent (FF = -1, FB = -5)
  • mantissa: variable-length integer

Example:

0.1     → 04 06 FF 01
0.01    → 04 06 FE 01
0.00002 → 04 06 FB 02

Large value:

2^32 + 0.1
→ 08 06 FF 01 00 00 00 0A

Interpretation:

value = mantissa × 10^exponent

Negative values mirror the same structure:

-0.00002 → 04 07 FB FE

Decimal values are stored as scaled integers, preserving exact decimal semantics.
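A possible Python decoder for the decimal types is sketched below, using `decimal.Decimal` to keep the value exact. `decode_decimal` is a hypothetical helper, and it assumes that the mantissa of a type 07 element is encoded the same way as a negative integer (two's complement with leading FF bytes dropped), which matches the single negative example above but is not confirmed beyond it.

```python
from decimal import Decimal


def decode_decimal(type_code: int, payload: bytes, byteorder: str = "little") -> Decimal:
    """Decode [scale][mantissa...] into mantissa * 10**scale."""
    if type_code not in (0x06, 0x07) or not payload:
        raise ValueError("expected a type 06/07 payload with a scale byte")
    # The scale is one signed byte: FF is -1, FB is -5.
    scale = int.from_bytes(payload[:1], byteorder, signed=True)
    mantissa_bytes = payload[1:]
    mantissa = int.from_bytes(mantissa_bytes, byteorder)
    if type_code == 0x07:
        # Assumption: same variable-width two's complement as type 05 integers.
        mantissa -= 1 << (8 * len(mantissa_bytes))
    return Decimal(mantissa).scaleb(scale)


print(decode_decimal(0x06, bytes.fromhex("FF 01")))              # 0.1
print(decode_decimal(0x06, bytes.fromhex("FF 01 00 00 00 0A")))  # 4294967296.1
print(decode_decimal(0x07, bytes.fromhex("FB FE")))              # -0.00002
```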


Binary Floating-Point

IRIS supports two binary floating-point encodings.


Compact Float (Type 08)

[length][08][payload...]
  • IEEE 754 single-precision (float32)
  • little/big-endian depending on platform
  • low-order zero bytes of the 4-byte value are omitted (in little-endian order these are the first bytes)

Examples:

1.5  → 04 08 C0 3F
1.25 → 04 08 A0 3F
0.5  → 03 08 3F
10.0 → 04 08 20 41
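The compact form can be expanded back to a full float32 by restoring the omitted zero bytes, as in this Python sketch. `decode_compact_float` is a hypothetical helper; the little-endian branch follows the examples above, while the big-endian branch is an assumption (it restores the zeros at the end of the payload) that has not been checked against a big-endian dump.

```python
import struct


def decode_compact_float(payload: bytes, byteorder: str = "little") -> float:
    """Expand a type 08 payload to 4 bytes and read it as IEEE 754 float32."""
    if len(payload) > 4:
        raise ValueError("a type 08 payload holds at most 4 bytes")
    padding = bytes(4 - len(payload))  # the omitted low-order zero bytes
    if byteorder == "little":
        # Low-order bytes come first in little-endian order.
        return struct.unpack("<f", padding + payload)[0]
    # Assumption: on big-endian platforms the omitted bytes are at the end.
    return struct.unpack(">f", payload + padding)[0]


print(decode_compact_float(bytes.fromhex("C0 3F")))  # 1.5
print(decode_compact_float(bytes.fromhex("3F")))     # 0.5
print(decode_compact_float(bytes.fromhex("20 41")))  # 10.0
```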

IEEE Double (Type 09)

[length][09][8 bytes]
  • IEEE 754 double-precision (float64)
  • fixed 8-byte payload

Example:

$double(0.1)
→ 0A 09 9A 99 99 99 99 99 B9 3F
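Since the type 09 payload is a plain 8-byte double, decoding it in Python needs only the standard `struct` module. The helper name `decode_double` is hypothetical.

```python
import struct


def decode_double(payload: bytes, byteorder: str = "little") -> float:
    """Read a type 09 payload as an IEEE 754 float64."""
    if len(payload) != 8:
        raise ValueError("a type 09 payload is exactly 8 bytes")
    fmt = "<d" if byteorder == "little" else ">d"
    return struct.unpack(fmt, payload)[0]


raw = bytes.fromhex("0A 09 9A 99 99 99 99 99 B9 3F")
print(decode_double(raw[2:]))  # 0.1
```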

Summary

$LISTBUILD is not just a helper function — it is a core binary storage format inside IRIS.

It combines:

  • variable-length element encoding

  • compact integer representation

  • decimal values as scaled integers

  • binary floating-point in two forms:

    • compact float32 (08)
    • full float64 (09)

The format is:

  • space-efficient
  • self-delimiting
  • internally consistent across types

While not formally documented, its structure is stable and precise enough to be reverse-engineered and implemented outside IRIS.

Comments

Robert Cemper · 10 hr ago

THANK YOU @Dmitry Maslennikov  for this excellent insight into the probably 
most important internal structure element of IRIS and its data type variations. 

👍👏
 

Julius Kavay · 9 hr ago

Just two comments:
"Observations: ... payload grows only when required" is correct, but a more precise explanation would be "the whole list structure is built with minimal memory usage in mind".

The above note (minimum storage size) leads to special cases:

1) an integer 0 is stored in two bytes only
   02 04     (length, type) and not
   03 04 00  (length, type, data) which is also accepted(*)
   
   The same goes for nullstring (which is obvious)
   02 01     (length, type and, of course, no data), ASCII nullstring
   02 02     (length, type), a "Unicode" nullstring
   
2) If only the length component is present and equals 1,
   then this indicates a NULL element (i.e. a missing element):

set x=$lb(85,,,0,"","abc")
zzdump x --> 03 04 55 01 01 02 04 02 01 05 01 61 62 63

which breaks down into
03 04 55         $li(x,1) = 85
01               $li(x,2) = <NULL VALUE>
01               $li(x,3) = <NULL VALUE>
02 04            $li(x,4) = 0
02 01            $li(x,5) = "" / nullstring
05 01 61 62 63   $li(x,6) = "abc"
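The breakdown above can be reproduced with a short Python sketch that walks the list byte by byte. The function name `split_elements` is hypothetical, a NULL element is reported as `None`, and only 1-byte lengths are handled.

```python
def split_elements(data: bytes):
    """Return one entry per element: (type, payload), or None for a NULL element."""
    elements = []
    offset = 0
    while offset < len(data):
        length = data[offset]
        if length == 0:
            raise ValueError("extended length forms are not handled in this sketch")
        if offset + length > len(data):
            raise ValueError("truncated element")
        if length == 1:
            # Only the length byte is present: a missing (NULL) element.
            elements.append(None)
        else:
            elements.append((data[offset + 1], data[offset + 2 : offset + length]))
        offset += length
    return elements


raw = bytes.fromhex("03 04 55 01 01 02 04 02 01 05 01 61 62 63")
for index, element in enumerate(split_elements(raw), start=1):
    print(index, element)
```

For this input the six entries correspond to 85, two NULL elements, 0, the nullstring, and "abc", in that order.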

(*) I use this side effect (lists use minimal bytes for integers but accept longer encodings, and ASCII strings are accepted even if their type is Unicode) in some of my CallOuts to return results (as an IRIS list) without explicitly converting C's two-byte strings into one-byte ASCII where it applies, and I return integer values as either four or eight bytes even if the value would fit in two, three or five bytes.

For example, the ReadColumn() method of the excel library class could return something like

set colData = %exl.ReadColumn(3)

zzdump colData
08 02 61 00 62 00 63 00 06 04 55 00 00 00   (ASCII "as" Unicode, 1 byte integer in 4 bytes)

zwrite colData
colData=$lb("abc",85)


Thank you guys at ISC for this wise implementation!
 
