Article
· Apr 22 4m read

Testing Metadata Inconsistencies in InterSystems IRIS Using the DATATYPE_SAMPLE Database

When using standard SQL or the object layer in InterSystems IRIS, metadata consistency is usually maintained through built-in validation and type enforcement. However, legacy systems that bypass these layers—directly accessing globals—can introduce subtle and serious inconsistencies.

Understanding how drivers behave in these edge cases is crucial for diagnosing legacy data issues and ensuring application reliability.

The DATATYPE_SAMPLE database is designed to help analyze error scenarios where column values do not conform to the data types or constraints defined in the metadata. The goal is to evaluate how InterSystems IRIS and its drivers (JDBC, ODBC, .NET) and different tools behave when such inconsistencies occur. In this post, I’ll focus on the JDBC driver.


What's the Problem?

Some legacy applications write directly to globals. If a relational model (created via CREATE TABLE or manually defined using a global mapping) is used to expose this data, the mapping defines the underlying values conform to the declared metadata for each column.

When this assumption is broken, different types of problems may occur:

  1. Access Failure: A value cannot be read at all, and an exception is thrown when the driver tries to access it.
  2. Silent Corruption: The value is read successfully but does not match the expected metadata.
  3. Undetected Mutation: The value is read and appears valid, but was silently altered by the driver to fit the metadata, making the inconsistency hard to detect.

Simulating the Behavior

To demonstrate these scenarios, I created the DATATYPE_SAMPLE database, available on the InterSystems Open Exchange:
🔗 Package page
🔗 GitHub repo

The table used for the demonstration:

CREATE TABLE SQLUser.Employee (
   ID              BIGINT          NOT NULL AUTO_INCREMENT,
   Age             INTEGER,
   Company         BIGINT,
   DOB             DATE,
   FavoriteColors  VARCHAR(4096),
   Name            VARCHAR(50)     NOT NULL,
   Notes           LONGVARCHAR,
   Picture         LONGVARBINARY,
   SSN             VARCHAR(50)     NOT NULL,
   Salary          INTEGER,
   Spouse          BIGINT,
   Title           VARCHAR(50),
   Home_City       VARCHAR(80),
   Home_State      VARCHAR(2),
   Home_Street     VARCHAR(80),
   Home_Zip        VARCHAR(5),
   Office_City     VARCHAR(80),
   Office_State    VARCHAR(2),
   Office_Street   VARCHAR(80),
   Office_Zip      VARCHAR(5)
);

Example 1: Access Failure

To simulate an inconsistency, I injected invalid values into the DOB (Date of Birth\Datatype DATE) column using direct global access. Specifically, the rows with primary keys 101, 180, 181, 182, 183, 184, and 185 were populated with values that do not represent valid dates.

The values looks like this now:
 
As you can see, a string was appended to the end of a $H (Horolog) value. According to the table's metadata, this column is expected to contain a date—but the stored value clearly isn't one.

So what happens when you try to read this data? Well, it depends on the tool you're using. I tested a few different tools to compare how they handle this kind of inconsistency.

1) SquirrelSQL (SQuirreL SQL Client Home Page)

When SquirrelSQL attempts to access the data, an error occurs. It tries to read all rows and columns, and any cell that contains invalid data is simply marked as "ERROR". Unfortunately, I couldn't find any additional details or error messages explaining the cause.

  

2) SQLWorkbench/J (SQL Workbench/J -  Home)
SQL Workbench/J stops processing the result set as soon as it encounters the first invalid cell. It displays an error message like "Invalid date", but unfortunately, it doesn't provide any information about which row caused the issue.

 
3) DBVisualizer (dbvis) &  DBeaver (dbeaver)

DBVisualizer and DBeaver behave similarly. Both tools continue reading the result set and provide detailed error messages for each affected cell. This makes it easy to identify the corresponding row that caused the issue.

   

4) SQL DATA LENS (SQL Data Lens - a powerful tool for InterSystems IRIS and Caché)

With the latest release of SQL DATA LENS, you get detailed information about the error, the affected row, and the actual database value. As shown in the screenshot, the internal value for the first row in columns DOB is "39146<Ruined>", which cannot be cast to a valid DATE.

SQL DATA LENS also allows you to configure whether result processing should stop at the first erroneous cell or continue reading to retrieve all available data.
 


The next part of this article will shows details about:

Silent Corruption: The value is read successfully but does not match the expected metadata.

Undetected Mutation: The value is read and appears valid, but was silently altered by the driver to fit the metadata, making the inconsistency hard to detect.



Andreas

Discussion (0)2
Log in or sign up to continue