Article
· Apr 23 3m read

Testing Metadata Inconsistencies in InterSystems IRIS Using the DATATYPE_SAMPLE Database (Part II) - Silent Corruption

The first part of this article provides all the background information. It also includes links to the DATATYPE_SAMPLE database, which you can use to follow along with the examples.

In that section, we explored an error type ("Access Failure") that is easy to detect, as it immediately triggers a clear error message when attempting to read the data via the database driver.

The errors discussed in this section are more subtle and harder to detect. I’ve referred to them as “Silent Corruption” and “Undetected Mutation”

Let’s start with “Silent Corruption”:
In the Employee table of the DATATYPE_SAMPLE database, there is a deliberately manipulated record that demonstrates this behavior – it’s the row with ID = 110. At first glance – and even at second glance – no issues are apparent. Neither the database driver nor the query tool indicates a problem if read this row.

Only upon closer inspection does it become clear that the value in the red-marked cell does not match the transmitted (and defined) metadata.

The column "Name" is defined as VARCHAR(50), but the actual value is 60 characters long!

 

There are scenarios where this behavior doesn’t cause any issues — for example, when the driver handles such inconsistencies leniently.
However, problems can arise when downstream systems rely on the provided metadata. If further processing is based on these metadata definitions, errors may occur when the actual content doesn’t conform to the agreed interface.

A typical example involves ETL tools, which often generate target tables or define transformations based on metadata.

The following SQL query can be used to identify records where the content deviates from the defined metadata:

SELECT
 Name,
 CASE WHEN LENGTH(Name) > 50 THEN 1 ELSE 0 END AS Name_LENGTH_CHECK
,SSN,
 CASE WHEN LENGTH(SSN) > 50 THEN 1 ELSE 0 END AS SSN_LENGTH_CHECK
FROM SQLUser.Employee
WHERE 
      LENGTH(Name) > 50
OR    LENGTH(SSN) > 50

If you execute this query, it will return only the rows that contain errors. In each row, the faulty cell will be marked with a 1 if the value exceeds the length defined by the metadata.
 

Now let’s take a look at the next error type: “Undetected Mutation.”
To demonstrate this issue, the DATATYPE_SAMPLE database includes a faulty record specifically changed to illustrate this behavior.

The record in question is the row with ID = 120

Again, neither the database driver nor the query tool will indicate a problem when reading this row.

 

In this case, the value even appears to match the metadata! The column is defined as INTEGER, and the row returns an integer value (in this example: 0) in that cell. 

However, this value is not actually stored in the database! A direct look at the underlying Global reveals the true content. Through manipulation, a string value was injected into this field.

 

SELECT
 CAST(Age AS VARCHAR(255)) AS Age,
 ISNUMERIC(CAST(Age AS VARCHAR(255))) AS Age_ISNUMERIC
FROM SQLUser.Employee
WHERE
    ISNUMERIC(CAST(Age AS VARCHAR(255))) = 0


If you execute this query, it will return only those rows that contain metadata inconsistencies. In each result row, the problematic cell is flagged with a 0 if the value cannot be interpreted as numeric by the driver

 

Final Thoughts

These scenarios highlight how seemingly well-formed data can conceal subtle inconsistencies—especially in legacy systems that bypass standard safeguards. While "Access Failures" are easy to spot, issues like "Silent Corruption" and "Undetected Mutation" often go unnoticed but can cause serious problems downstream, particularly in systems that rely on strict metadata compliance.

The DATATYPE_SAMPLE database and the diagnostic queries shared here provide a foundation for identifying such issues manually. But let’s face it—writing these checks by hand is tedious and error-prone.

Fortunately, SQL DATA LENS (min. version 3.22) makes this process much easier. 😉With just a click, it can generate comprehensive integrity checks for Tables, Views, and Stored Procedures—saving time and helping you stay ahead of hidden data quality issues.

Discussion (0)1
Log in or sign up to continue