Question
· May 27, 2021

JDBC counts don't match

I'm using a JDBC driver to connect PGSQL -> Cache.  I'm noticing when I run SELECT queries and COUNT(*) command against the same table, I get different result sets.  I'm pretty new to Cache in general - so I'm trying to understand why these would be different. 

Examples (TransID and InvNum should occur in every "row"):

SELECT COUNT(*) FROM ACCT.Services = 1,090,324 WHERE ACCT.Dept = 483

SELECT TransID FROM ACCT.Services = 1,085,776 WHERE ACCT.Dept = 483

SELECT InvNum FROM ACCT.Services = 586,023 WHERE ACCT.Dept = 483

(I tried the same thing in MSSQL via  LinkedServer and it unsurprisingly yielded the same result.)

I'm suspecting this has to do with indexes, possibly?  The only reason I say this is because the TransID seems to be the main index on this table.  Does the index only include 'active rows' entries?  Appreciate any insights.  Thanks.

Product version: Caché 2015.1
Discussion (8)0
Log in or sign up to continue

Thanks for quick reply, David.  I wanted to double-check a few things based on your recommendations. 

No, there are no grouping or sorting functions in the syntax.  The only filtering I'm using against the data is the Dept = 483.  

In my additional testing, I tried some additional syntax and noticed the following:

* SELECT COUNT(*) FROM ACCT.Services WHERE ACCT.Dept = 483 yields 1,085,776.

* SELECT COUNT(*), TransID FROM ACCT.Services WHERE ACCT.Dept = 483 AND TransID IS NULL yields 1,085,776.

* SELECT COUNT(*), TransID, ActnCode FROM ACCT.Services WHERE ACCT.Dept = 483 yields 1,085,776.

* SELECT COUNT(*), TransID, ActnCode, Comment WHERE ACT.Dept = 482 yields a result of 1,090,324.

* SELECT COUNT(*), TransID, ActnCode, Comment...[All the rest of the columns; too long to included here] WHERE ACT.Dept = 482 yields a result of 1,090,324.

I did some additional SELECT testing with other fields and consistently saw that if I only added index fields (e.g. TransID and ActnCode - either both or separately) I get the same 1,085,776 result.  If I add a non-indexed field (e.g. Comment, TransUserID) to the SELECT listing, I get the higher (1,090,324) result.  This holds true in both of those cases whether I have the COUNT(*) included in the SELECT statement or not.

Thanks again.

I'm not sure if this will help cinch the indexing being the culprit or not, but I pulled just the TransIDs out (into a table by themselves) and then tried to do a SELECT * FROM ACCT.Services WHERE TransID = ######### AND ACCT.Dept = 482 to pull specif TransID rows.  I spot checked a few that corresponded to the results in the SELECT COUNT(*), TransID, ActnCode, Comment WHERE ACT.Dept = 482 query and have already hit about three that didn't have any results.  I'm assuming at this point it is because they're somehow part of the index, but not actual "active rows".  My SQL-brain can't comprehend how I can have a query display a result but not find the detail that corresponds with that row id...but I'm sure it's my lack of Cache-101 knowledge...

For performance reasons, it's possible to define an Index in a way, that some of the columns will be as part of the index itself, just for search, and some data could be in the data part of that index, which will be used to display in the result if requested. So, if your index is somehow corrupted, the SQL engine will expect the values there, and will look for it, so, it will not go to the place where the data originally seat. And as a result, you may not see some of the data in the output rows.