Question
· Aug 13

Inconsistency in Caché Database: Mismatch Between Total Count Based on Unique Identifiers, Deduplicated Count, and Conditional Query Results

In the Caché database, when calculating the total count based on the unique identifier of a record, the quantity is over 1.2 million. After removing duplicates based on the unique identifier and then calculating the total count, the quantity is over 400,000. When grouping by the unique identifier, it can be observed that the count for this identifier is not one. However, when performing a conditional query based on the identifier, only one record can be retrieved. Why is this the case?

Product version: Caché 2016.1
Discussion (7)3
Log in or sign up to continue

In SMP (System Management Portal), you step to EXLORER and then step into SQL
where you select your TABLE. and can rebuild index


Furthermore, every persistent class  has by default
• classmethod %BuildDeferredIndices
• classmethod %BuildIndices
• classmethod %BuildIndicesAsync

Next variant : use $SYSTEM.OBJ.ValidateIndices()
Details described here  Fix broken index 8 years ago, still valid
 

Well, you have to do it yourself.
Suggestion: Keep a list of the indices processed and skip all followers
For the list you need a small Stored Procedure that you add to
your SQL SELECT in the WHERE clause.

CREATE PROCEDURE SQLUSER.DUPL(value VARCHAR, id INTEGER)
RETURNS INTEGER
LANGUAGE OBJECTSCRIPT
{
 set used=$d(^||dupl(value))
 set ^||dupl(value,id)=$i(^||dupl(value))  
 quit used
}

And in the SELECT

SELECT id, sickindex, . . . . . 
FROM your.data 
WHERE DUPL(sickindex,id) < 1

As a side effect, you create a list of affected indices.
I used a PPG to avoid the need to clear it before use.
If you are interested in the duplicate, you need to change the global name
and add some cleanup before use