Written by

MOD
Question Nicky Zhu · Feb 12, 2023

How to optimize slow count distinct SQL

Hi guys,

I'm working on a project with my client.
They have a visit table which has about 7,000,000 records. The table is used in a random search page witch holds 20+ conditions to be combined.
The table is defined as below:

CREATETABLE SQLUser.ST_SEARCH_VISIT (

HQ_ORG_CODE VARCHAR(32),

HQ_ORG_NAME VARCHAR(32),

TENANT_ID VARCHAR(32),

GROUP_PATIENT_SN VARCHAR(32),

ORG_PATIENT_SN VARCHAR(32),

NAMEVARCHAR(64),

SEX_CODE VARCHAR(2),

SEX_NAME VARCHAR(16),

BIRTH_DATE DATE,

MARRY_CODE VARCHAR(2),

MARRY_NAME VARCHAR(16),

IDENTIFY_TYPE_CODE VARCHAR(5),

IDENTIFY_TYPE_NAME VARCHAR(32),

IDENTIFY_NUMBER VARCHAR(50),

BLOOD_TYPE_CODE VARCHAR(2),

BLOOD_TYPE_NAME VARCHAR(12),

MOBILE VARCHAR(50),

MAILING_ADDRESS VARCHAR(127),

VISIT_SERIAL_NO VARCHAR(64),

TABLE_FLAG VARCHAR(1),

VISIT_TYPE_CODE VARCHAR(24),

VISIT_TYPE_NAME VARCHAR(64),

VISIT_DEPT_CODE VARCHAR(32),

VISIT_DEPT_NAME VARCHAR(64),

IN_OUT_VISIT_NO VARCHAR(32),

VISIT_TIME TIMESTAMP,

DISCHARGE_TIME TIMESTAMP,

IN_HOSPITAL_TIME TIMESTAMP,

ICD_CODE VARCHAR(20),

ICD_NAME VARCHAR(64),

HEALTH_CARD_NO VARCHAR(32),

HEALTH_CARD_TYPE VARCHAR(64),

SURG_CODE VARCHAR(20),

SURG_NAME VARCHAR(64),

AGE VARCHAR(10),

DISEASE_CODE INTEGER,

DISEASE_NAME VARCHAR(64),

LastChangedTime TIMESTAMP,

LastCreateTime TIMESTAMP,

GROUPORG VARCHAR(50)

);

CREATEINDEX BITMAP_INDEX_ST_SEARCH_VISIT_BLOOD_TYPE_CODE ON SQLUser.ST_SEARCH_VISIT (BLOOD_TYPE_CODE);

CREATEINDEX BITMAP_INDEX_ST_SEARCH_VISIT_HQ_ORG_CODE ON SQLUser.ST_SEARCH_VISIT (HQ_ORG_CODE);

CREATEINDEX BITMAP_INDEX_ST_SEARCH_VISIT_IDENTIFY_TYPE_CODE ON SQLUser.ST_SEARCH_VISIT (IDENTIFY_TYPE_CODE);

CREATEINDEX BITMAP_INDEX_ST_SEARCH_VISIT_MARRY_CODE ON SQLUser.ST_SEARCH_VISIT (MARRY_CODE);

CREATEINDEX BITMAP_INDEX_ST_SEARCH_VISIT_SEX_CODE ON SQLUser.ST_SEARCH_VISIT (SEX_CODE);

CREATEINDEX BITMAP_INDEX_ST_SEARCH_VISIT_TABLE_FLAG ON SQLUser.ST_SEARCH_VISIT (TABLE_FLAG);

CREATEINDEX BITMAP_INDEX_ST_SEARCH_VISIT_VISIT_DEPT_CODE ON SQLUser.ST_SEARCH_VISIT (VISIT_DEPT_CODE);

CREATEINDEX BITMAP_INDEX_ST_SEARCH_VISIT_VISIT_TYPE_CODE ON SQLUser.ST_SEARCH_VISIT (VISIT_TYPE_CODE);

CREATEINDEX INDEX_PI_PATIENT_INFO_HQ_ORG_CODE ON SQLUser.ST_SEARCH_VISIT (HQ_ORG_CODE,GROUP_PATIENT_SN);

CREATEINDEX INDEX_ST_SEARCH_VISIT_BIRTH_DATE ON SQLUser.ST_SEARCH_VISIT (BIRTH_DATE);

CREATEINDEX INDEX_ST_SEARCH_VISIT_DISCHARGE_TIME ON SQLUser.ST_SEARCH_VISIT (DISCHARGE_TIME);

CREATEINDEX INDEX_ST_SEARCH_VISIT_GROUP_PATIENT_SN ON SQLUser.ST_SEARCH_VISIT (GROUP_PATIENT_SN);

CREATEINDEX INDEX_ST_SEARCH_VISIT_IDENTIFY_NUMBER ON SQLUser.ST_SEARCH_VISIT (IDENTIFY_NUMBER);

CREATEINDEX INDEX_ST_SEARCH_VISIT_IN_HOSPITAL_TIME ON SQLUser.ST_SEARCH_VISIT (IN_HOSPITAL_TIME);

CREATEINDEX INDEX_ST_SEARCH_VISIT_IN_OUT_VISIT_NO ON SQLUser.ST_SEARCH_VISIT (IN_OUT_VISIT_NO);

CREATEINDEX INDEX_ST_SEARCH_VISIT_MOBILE ON SQLUser.ST_SEARCH_VISIT (MOBILE);

CREATEINDEX INDEX_ST_SEARCH_VISIT_NAME ON SQLUser.ST_SEARCH_VISIT (NAME);

CREATEINDEX INDEX_ST_SEARCH_VISIT_ORG_PATIENT_SN ON SQLUser.ST_SEARCH_VISIT (ORG_PATIENT_SN);

CREATEINDEX INDEX_ST_SEARCH_VISIT_VISIT_SERIAL_NO ON SQLUser.ST_SEARCH_VISIT (VISIT_SERIAL_NO);

CREATEINDEX INDEX_ST_SEARCH_VISIT_VISIT_TIME ON SQLUser.ST_SEARCH_VISIT (VISIT_TIME);

CREATEINDEX IdxGpsnorg ON SQLUser.ST_SEARCH_VISIT (GROUP_PATIENT_SN,HQ_ORG_CODE);

CREATEINDEX IdxMapOrg ON SQLUser.ST_SEARCH_VISIT (HQ_ORG_CODE);

CREATEINDEX IdxVisitorg ON SQLUser.ST_SEARCH_VISIT (VISIT_SERIAL_NO,HQ_ORG_CODE);

CREATEINDEX LastChangedTimeIndex ON SQLUser.ST_SEARCH_VISIT (LastChangedTime);

CREATEINDEX LastCreateTimeIndex ON SQLUser.ST_SEARCH_VISIT (LastCreateTime);

CREATEINDEX idxGO ON SQLUser.ST_SEARCH_VISIT (GROUPORG);

 

Now we have a slow SQL as below:

selectcount(1) from

(selectdistinctby (b.VISIT_SERIAL_NO,b.HQ_ORG_CODE) b.VISIT_SERIAL_NO,b.HQ_ORG_CODE  from SQLUser.ST_SEARCH_VISIT b )

Which is used to count distinct visits in about 20 hospitals.

This SQL will take more than 10 seconds to finish on their server.

I did a test on my virtual machines with the same table but with populated data(6,000,000 records) then it will took about 16 seconds to finish the search.

I tried to optimize the sql with the followings

------about 12.5s-------

select count(1) from (
    select distinct by (b.VISIT_SERIAL_NO,b.HQ_ORG_CODE) b.VISIT_SERIAL_NO,b.HQ_ORG_CODE  from %PARALLEL ST_SEARCH_VISIT b
    group by b.HQ_ORG_CODE
)

------about 24s-------

selectSUM(OrgSum) from (

select HQ_ORG_CODE,count(1) as OrgSum from %PARALLEL (

selectdistinctby (VISIT_SERIAL_NO,HQ_ORG_CODE) VISIT_SERIAL_NO,HQ_ORG_CODE

From %PARALLEL ST_SEARCH_VISIT

Groupby HQ_ORG_CODE

)

groupby HQ_ORG_CODE

)

 

So what optimizations could be done to make it faster? thanks.

Comments

Stefan Cronje · Feb 12, 2023

For a start:

select distinct by (VISIT_SERIAL_NO,HQ_ORG_CODE) VISIT_SERIAL_NO,HQ_ORG_CODE
can be changed to
select distinct VISIT_SERIAL_NO,HQ_ORG_CODE

It will do the same.

Secondly:
Will you please remove the %parallel and click on "Show Plan". Post that plan here. It will help to determine where the query is slow. It might be using the wrong index. There are many.

Lastly:
Have you tuned the table and checked the result after that?

0
Nicky Zhu  Feb 12, 2023 to Stefan Cronje

Hi Stefan,

If I change 

select distinct by (VISIT_SERIAL_NO,HQ_ORG_CODE) VISIT_SERIAL_NO,HQ_ORG_CODE

into select distinct VISIT_SERIAL_NO,HQ_ORG_CODE,

The sum number is quite different

actually the second sql is equal to 

it seems to be caused by the group by clause on a bitmap index, if I remove it, the result is then consistent

I tuned the table and rebuild the indices but the result is the same.

The followings are the query plans

0
Stefan Cronje  Feb 12, 2023 to Nicky Zhu

If you want to use the group by, then you should probably do the count where the group by is being done, and use distinct by as you had it.

select distinct by(serial,hq) hq, count(1)
from thetable
group by hq

If you want it per that grouping.

There are no filters in it, so it is going to do a full table scan and compare and calculate values for each row. Taking the amount of time it takes, it is actually fast.

Maybe look into Bitslice indexes. It might help but at a cost of performance on insert and update:
InterSystems Documentation

0
Nicky Zhu  Feb 12, 2023 to Stefan Cronje

Hi Stefan.

It is from a random search page with 20+ possible conditions to combine. This case is one of the slowest SQL we find. In other cases, with more conditions to act as filters, the performance is OK.

I'll try whether Bitslice index works for this case.

Thanks.

0
Vitaliy Serdtsev · Feb 13, 2023

There are a couple of points:

  • could you please publish the source code of the class, and not the SQL script? The fact is that the command
    CREATE INDEX BITMAP_INDEX_ST_SEARCH_VISIT_BLOOD_TYPE_CODE ON SQLUser.ST_SEARCH_VISIT (BLOOD_TYPE_CODE)

    creates a regular index, not a bitmap index as expected. I'm afraid my code may not match yours.

  • I created a test class with test data on Caché 2018.x and IRIS 2023.1FT and I have your query executed very quickly. Details below.
 

User.STSEARCHVISIT.cls

/// 
Class User.STSEARCHVISIT Extends %Persistent ClassType = persistent, DdlAllowedFinalProcedureBlockSqlRowIdPrivateSqlTableName ST_SEARCH_VISIT ]
{

Property HQORGCODE As %Library.String(MAXLEN 32) [ SqlColumnNumber = 2, SqlFieldName HQ_ORG_CODE ];

Property HQORGNAME As %Library.String(MAXLEN 32) [ SqlColumnNumber = 3, SqlFieldName HQ_ORG_NAME ];

Property TENANTID As %Library.String(MAXLEN 32) [ SqlColumnNumber = 4, SqlFieldName TENANT_ID ];

Property GROUPPATIENTSN As %Library.String(MAXLEN 32) [ SqlColumnNumber = 5, SqlFieldName GROUP_PATIENT_SN ];

Property ORGPATIENTSN As %Library.String(MAXLEN 32) [ SqlColumnNumber = 6, SqlFieldName ORG_PATIENT_SN ];

Property NAME As %Library.String(MAXLEN 64) [ SqlColumnNumber = 7 ];

Property SEXCODE As %Library.String(MAXLEN 2) [ SqlColumnNumber = 8, SqlFieldName SEX_CODE ];

Property SEXNAME As %Library.String(MAXLEN 16) [ SqlColumnNumber = 9, SqlFieldName SEX_NAME ];

Property BIRTHDATE As %Library.Date SqlColumnNumber = 10, SqlFieldName BIRTH_DATE ];

Property MARRYCODE As %Library.String(MAXLEN 2) [ SqlColumnNumber = 11, SqlFieldName MARRY_CODE ];

Property MARRYNAME As %Library.String(MAXLEN 16) [ SqlColumnNumber = 12, SqlFieldName MARRY_NAME ];

Property IDENTIFYTYPECODE As %Library.String(MAXLEN 5) [ SqlColumnNumber = 13, SqlFieldName IDENTIFY_TYPE_CODE ];

Property IDENTIFYTYPENAME As %Library.String(MAXLEN 32) [ SqlColumnNumber = 14, SqlFieldName IDENTIFY_TYPE_NAME ];

Property IDENTIFYNUMBER As %Library.String(MAXLEN 50) [ SqlColumnNumber = 15, SqlFieldName IDENTIFY_NUMBER ];

Property BLOODTYPECODE As %Library.String(MAXLEN 2) [ SqlColumnNumber = 16, SqlFieldName BLOOD_TYPE_CODE ];

Property BLOODTYPENAME As %Library.String(MAXLEN 12) [ SqlColumnNumber = 17, SqlFieldName BLOOD_TYPE_NAME ];

Property MOBILE As %Library.String(MAXLEN 50) [ SqlColumnNumber = 18 ];

Property MAILINGADDRESS As %Library.String(MAXLEN 127) [ SqlColumnNumber = 19, SqlFieldName MAILING_ADDRESS ];

Property VISITSERIALNO As %Library.String(MAXLEN 64) [ SqlColumnNumber = 20, SqlFieldName VISIT_SERIAL_NO ];

Property TABLEFLAG As %Library.String(MAXLEN 1) [ SqlColumnNumber = 21, SqlFieldName TABLE_FLAG ];

Property VISITTYPECODE As %Library.String(MAXLEN 24) [ SqlColumnNumber = 22, SqlFieldName VISIT_TYPE_CODE ];

Property VISITTYPENAME As %Library.String(MAXLEN 64) [ SqlColumnNumber = 23, SqlFieldName VISIT_TYPE_NAME ];

Property VISITDEPTCODE As %Library.String(MAXLEN 32) [ SqlColumnNumber = 24, SqlFieldName VISIT_DEPT_CODE ];

Property VISITDEPTNAME As %Library.String(MAXLEN 64) [ SqlColumnNumber = 25, SqlFieldName VISIT_DEPT_NAME ];

Property INOUTVISITNO As %Library.String(MAXLEN 32) [ SqlColumnNumber = 26, SqlFieldName IN_OUT_VISIT_NO ];

Property VISITTIME As %Library.PosixTime SqlColumnNumber = 27, SqlFieldName VISIT_TIME ];

Property DISCHARGETIME As %Library.PosixTime SqlColumnNumber = 28, SqlFieldName DISCHARGE_TIME ];

Property INHOSPITALTIME As %Library.PosixTime SqlColumnNumber = 29, SqlFieldName IN_HOSPITAL_TIME ];

Property ICDCODE As %Library.String(MAXLEN 20) [ SqlColumnNumber = 30, SqlFieldName ICD_CODE ];

Property ICDNAME As %Library.String(MAXLEN 64) [ SqlColumnNumber = 31, SqlFieldName ICD_NAME ];

Property HEALTHCARDNO As %Library.String(MAXLEN 32) [ SqlColumnNumber = 32, SqlFieldName HEALTH_CARD_NO ];

Property HEALTHCARDTYPE As %Library.String(MAXLEN 64) [ SqlColumnNumber = 33, SqlFieldName HEALTH_CARD_TYPE ];

Property SURGCODE As %Library.String(MAXLEN 20) [ SqlColumnNumber = 34, SqlFieldName SURG_CODE ];

Property SURGNAME As %Library.String(MAXLEN 64) [ SqlColumnNumber = 35, SqlFieldName SURG_NAME ];

Property AGE As %Library.String(MAXLEN 10) [ SqlColumnNumber = 36 ];

Property DISEASECODE As %Library.Integer(MAXVAL 2147483647MINVAL -2147483648) [ SqlColumnNumber = 37, SqlFieldName DISEASE_CODE ];

Property DISEASENAME As %Library.String(MAXLEN 64) [ SqlColumnNumber = 38, SqlFieldName DISEASE_NAME ];

Property LastChangedTime As %Library.PosixTime SqlColumnNumber = 39 ];

Property LastCreateTime As %Library.PosixTime SqlColumnNumber = 40 ];

Property GROUPORG As %Library.String(MAXLEN 50) [ SqlColumnNumber = 41 ];

Parameter USEEXTENTSET = 0;

/// Bitmap Extent Index auto-generated by DDL CREATE TABLE statement.  Do not edit the SqlName of this index.
Index DDLBEIndex [ ExtentSqlName "%%DDLBEIndex"Type = bitmap ];

Index BITMAPINDEXSTSEARCHVISITBLOODTYPECODE On BLOODTYPECODE [ SqlName BITMAP_INDEX_ST_SEARCH_VISIT_BLOOD_TYPE_CODEType = bitmap ];

Index BITMAPINDEXSTSEARCHVISITHQORGCODE On HQORGCODE [ SqlName BITMAP_INDEX_ST_SEARCH_VISIT_HQ_ORG_CODEType = bitmap ];

Index BITMAPINDEXSTSEARCHVISITIDENTIFYTYPECODE On IDENTIFYTYPECODE [ SqlName BITMAP_INDEX_ST_SEARCH_VISIT_IDENTIFY_TYPE_CODEType = bitmap ];

Index BITMAPINDEXSTSEARCHVISITMARRYCODE On MARRYCODE [ SqlName BITMAP_INDEX_ST_SEARCH_VISIT_MARRY_CODEType = bitmap ];

Index BITMAPINDEXSTSEARCHVISITSEXCODE On SEXCODE [ SqlName BITMAP_INDEX_ST_SEARCH_VISIT_SEX_CODEType = bitmap ];

Index BITMAPINDEXSTSEARCHVISITTABLEFLAG On TABLEFLAG [ SqlName BITMAP_INDEX_ST_SEARCH_VISIT_TABLE_FLAGType = bitmap ];

Index BITMAPINDEXSTSEARCHVISITVISITDEPTCODE On VISITDEPTCODE [ SqlName BITMAP_INDEX_ST_SEARCH_VISIT_VISIT_DEPT_CODEType = bitmap ];

Index BITMAPINDEXSTSEARCHVISITVISITTYPECODE On VISITTYPECODE [ SqlName BITMAP_INDEX_ST_SEARCH_VISIT_VISIT_TYPE_CODEType = bitmap ];

Index INDEXSTSEARCHVISITBIRTHDATE On BIRTHDATE [ SqlName INDEX_ST_SEARCH_VISIT_BIRTH_DATEType = index ];

Index INDEXSTSEARCHVISITDISCHARGETIME On DISCHARGETIME [ SqlName INDEX_ST_SEARCH_VISIT_DISCHARGE_TIMEType = index ];

Index INDEXSTSEARCHVISITGROUPPATIENTSN On GROUPPATIENTSN [ SqlName INDEX_ST_SEARCH_VISIT_GROUP_PATIENT_SNType = index ];

Index INDEXSTSEARCHVISITIDENTIFYNUMBER On IDENTIFYNUMBER [ SqlName INDEX_ST_SEARCH_VISIT_IDENTIFY_NUMBERType = index ];

Index INDEXSTSEARCHVISITINHOSPITALTIME On INHOSPITALTIME [ SqlName INDEX_ST_SEARCH_VISIT_IN_HOSPITAL_TIMEType = index ];

Index INDEXSTSEARCHVISITINOUTVISITNO On INOUTVISITNO [ SqlName INDEX_ST_SEARCH_VISIT_IN_OUT_VISIT_NOType = index ];

Index INDEXSTSEARCHVISITMOBILE On MOBILE [ SqlName INDEX_ST_SEARCH_VISIT_MOBILEType = index ];

Index INDEXSTSEARCHVISITNAME On NAME [ SqlName INDEX_ST_SEARCH_VISIT_NAMEType = index ];

Index INDEXSTSEARCHVISITORGPATIENTSN On ORGPATIENTSN [ SqlName INDEX_ST_SEARCH_VISIT_ORG_PATIENT_SNType = index ];

Index INDEXSTSEARCHVISITVISITSERIALNO On VISITSERIALNO [ SqlName INDEX_ST_SEARCH_VISIT_VISIT_SERIAL_NOType = index ];

Index INDEXSTSEARCHVISITVISITTIME On VISITTIME [ SqlName INDEX_ST_SEARCH_VISIT_VISIT_TIMEType = index ];

Index IdxMapOrg On HQORGCODE [ SqlName IdxMapOrgType = index ];

Index LastChangedTimeIndex On LastChangedTime [ SqlName LastChangedTimeIndexType = index ];

Index LastCreateTimeIndex On LastCreateTime [ SqlName LastCreateTimeIndexType = index ];

Index idxGO On GROUPORG [ SqlName idxGOType = index ];

Index INDEXPIPATIENTINFOHQORGCODE On (HQORGCODE, GROUPPATIENTSN) [ SqlName INDEX_PI_PATIENT_INFO_HQ_ORG_CODEType = index ];

Index IdxGpsnorg On (GROUPPATIENTSN, HQORGCODE) [ SqlName IdxGpsnorgType = index ];

Index IdxVisitorg On (VISITSERIALNO, HQORGCODE) [ SqlName IdxVisitorgType = index ];

ClassMethod Fill(50)
{
  
  ; bulk insert via SQL. This is convenient if USEEXTENTSET = 1
  /*
  d ..%KillExtent()
  f i=1:1:N &sql(insert %NOLOCK %NOCHECK %NOINDEX %NOTRIGGER %NOJOURN into ST_SEARCH_VISIT(HQ_ORG_CODE,VISIT_SERIAL_NO)values(:i#4,:i#5))
  */
  
  ; bulk insert via globals
  DISABLE^%NOJRN
  
  ^User.STSEARCHVISITD,^User.STSEARCHVISITI

  i=1:1:{
    s $li(^User.STSEARCHVISITD(i),1)=i#4, ;HQ_ORG_CODE
      $li(^User.STSEARCHVISITD(i),19)=i#5 ;VISIT_SERIAL_NO
  }
  ^User.STSEARCHVISITD=N
  
  ENABLE^%NOJRN
  
  
  "insert - OK!",!
  ..%BuildIndices(,,,$$$NO)
  "BuildIndices - OK!",!
  
  d $system.SQL.TuneTable("ST_SEARCH_VISIT",$$$YES)
  d $system.OBJ.Compile($classname(),"cu-d")
  "OK",!
}

}

Generating 6000000 records

USER>##class(User.STSEARCHVISIT).Fill(6e6)
OK
explain select count(distinct by(VISIT_SERIAL_NO,HQ_ORG_CODE) 1) from ST_SEARCH_VISIT

<plans>
  <plan>
    <sql>
      select count(distinct by(VISIT_SERIAL_NO,HQ_ORG_CODE) 1) from ST_SEARCH_VISIT /*#OPTIONS {"DynamicSQL":1} */
    </sql>
    <cost value="29794316"/>
    Call module B. Output the row.
    <module name="B" top="1">
      Read index map ST_SEARCH_VISIT.IdxVisitorg, looping on %SQLUPPER(VISIT_SERIAL_NO), %SQLUPPER(HQ_ORG_CODE), and ID. For each row: Check distinct values for %SQLUPPER(HQ_ORG_CODE) and %SQLUPPER(VISIT_SERIAL_NO) using a temp-file. For each distinct row: Accumulate the count([value]).
    </module>
  </plan>
</plans>

select count(distinct by(VISIT_SERIAL_NO,HQ_ORG_CODE) 1) from ST_SEARCH_VISIT

Row count: 1 Performance: 2.8704 seconds 6003078 global references 36016391 lines executed

explain select count(1) from (select distinct by (VISIT_SERIAL_NO,HQ_ORG_CODE) VISIT_SERIAL_NO,HQ_ORG_CODE  from ST_SEARCH_VISIT)

<plans>
  <plan>
    <sql>
      select count(1) from (select distinct by (VISIT_SERIAL_NO,HQ_ORG_CODE) VISIT_SERIAL_NO,HQ_ORG_CODE from ST_SEARCH_VISIT) /*#OPTIONS {"DynamicSQL":1} */
    </sql>
    <cost value="21000"/>
    Call module B. Output the row.
    <module name="B" top="1">
      Read index map ST_SEARCH_VISIT.IdxVisitorg, looping on %SQLUPPER(VISIT_SERIAL_NO) and %SQLUPPER(HQ_ORG_CODE). For each row: Call module I. Accumulate the count([value]).
    </module>
    <module name="I" top="1">
      Read index map ST_SEARCH_VISIT.IdxVisitorg, using the given %SQLUPPER(VISIT_SERIAL_NO) and %SQLUPPER(HQ_ORG_CODE), and looping on ID. For each row: Accumulate the [value].
    </module>
  </plan>
</plans>

select count(1) from (select distinct by (VISIT_SERIAL_NO,HQ_ORG_CODE) VISIT_SERIAL_NO,HQ_ORG_CODE from ST_SEARCH_VISIT)

Row count: 1 Performance: 0.0269 seconds 3079 global references 17711 lines executed

0
Nicky Zhu  Feb 14, 2023 to Vitaliy Serdtsev

Hi Vitaliy,

Yes Benjamin also find out that the exported DDL is not consistent to the class definition, but users source code is so far controlled by the end user thus we'll need to wait until they can share it with us.

But I'm not sure what happened in your test with the Fill() method as mentioned above.

In the clients env, there are more than 7 million records and the result of count distinct end is 40,000 less than the total number, which means only a very small portion of VISIT_SERIAL_NO,HQ_ORG_CODE is not unique. Thus I used populate facility to generate 6 million data and the result of count distinct is more than 5.99 million. Then the query is slow.

while the distinct by version is much better

Both was run after tune table and rebuild indecies

You can find the class file I used in the test as below, so far I only used too index to handle this case, if any other index is helpful, we may add it back

Class User.STSEARCHVISIT Extends (%Persistent, %Populate) [ ClassType = persistent, DdlAllowed, Final, Owner = {SuperUser}, ProcedureBlock, SqlRowIdPrivate, SqlTableName = ST_SEARCH_VISIT ]
{

Property HQORGCODE As%Library.String(MAXLEN = 32) [ SqlColumnNumber = 2, SqlFieldName = HQ_ORG_CODE ];Property HQORGNAME As%Library.String(MAXLEN = 32) [ SqlColumnNumber = 3, SqlFieldName = HQ_ORG_NAME ];Property TENANTID As%Library.String(MAXLEN = 32) [ SqlColumnNumber = 4, SqlFieldName = TENANT_ID ];Property GROUPPATIENTSN As%Library.String(MAXLEN = 32) [ SqlColumnNumber = 5, SqlFieldName = GROUP_PATIENT_SN ];Property ORGPATIENTSN As%Library.String(MAXLEN = 32) [ SqlColumnNumber = 6, SqlFieldName = ORG_PATIENT_SN ];Property NAME As%Library.String(MAXLEN = 64) [ SqlColumnNumber = 7 ];Property SEXCODE As%Library.String(MAXLEN = 2) [ SqlColumnNumber = 8, SqlFieldName = SEX_CODE ];Property SEXNAME As%Library.String(MAXLEN = 16) [ SqlColumnNumber = 9, SqlFieldName = SEX_NAME ];Property BIRTHDATE As%Library.Date [ SqlColumnNumber = 10, SqlFieldName = BIRTH_DATE ];Property MARRYCODE As%Library.String(MAXLEN = 2) [ SqlColumnNumber = 11, SqlFieldName = MARRY_CODE ];Property MARRYNAME As%Library.String(MAXLEN = 16) [ SqlColumnNumber = 12, SqlFieldName = MARRY_NAME ];Property IDENTIFYTYPECODE As%Library.String(MAXLEN = 5) [ SqlColumnNumber = 13, SqlFieldName = IDENTIFY_TYPE_CODE ];Property IDENTIFYTYPENAME As%Library.String(MAXLEN = 32) [ SqlColumnNumber = 14, SqlFieldName = IDENTIFY_TYPE_NAME ];Property IDENTIFYNUMBER As%Library.String(MAXLEN = 50) [ SqlColumnNumber = 15, SqlFieldName = IDENTIFY_NUMBER ];Property BLOODTYPECODE As%Library.String(MAXLEN = 2) [ SqlColumnNumber = 16, SqlFieldName = BLOOD_TYPE_CODE ];Property BLOODTYPENAME As%Library.String(MAXLEN = 12) [ SqlColumnNumber = 17, SqlFieldName = BLOOD_TYPE_NAME ];Property MOBILE As%Library.String(MAXLEN = 50) [ SqlColumnNumber = 18 ];Property MAILINGADDRESS As%Library.String(MAXLEN = 127) [ SqlColumnNumber = 19, SqlFieldName = MAILING_ADDRESS ];Property VISITSERIALNO As%Library.String(MAXLEN = 64) [ SqlColumnNumber = 20, SqlFieldName = VISIT_SERIAL_NO ];Property TABLEFLAG As%Library.String(MAXLEN = 1) [ SqlColumnNumber = 21, SqlFieldName = TABLE_FLAG ];Property VISITTYPECODE As%Library.String(MAXLEN = 24) [ SqlColumnNumber = 22, SqlFieldName = VISIT_TYPE_CODE ];Property VISITTYPENAME As%Library.String(MAXLEN = 64) [ SqlColumnNumber = 23, SqlFieldName = VISIT_TYPE_NAME ];Property VISITDEPTCODE As%Library.String(MAXLEN = 32) [ SqlColumnNumber = 24, SqlFieldName = VISIT_DEPT_CODE ];Property VISITDEPTNAME As%Library.String(MAXLEN = 64) [ SqlColumnNumber = 25, SqlFieldName = VISIT_DEPT_NAME ];Property INOUTVISITNO As%Library.String(MAXLEN = 32) [ SqlColumnNumber = 26, SqlFieldName = IN_OUT_VISIT_NO ];Property VISITTIME As%Library.TimeStamp [ SqlColumnNumber = 27, SqlFieldName = VISIT_TIME ];Property DISCHARGETIME As%Library.TimeStamp [ SqlColumnNumber = 28, SqlFieldName = DISCHARGE_TIME ];Property INHOSPITALTIME As%Library.TimeStamp [ SqlColumnNumber = 29, SqlFieldName = IN_HOSPITAL_TIME ];Property ICDCODE As%Library.String(MAXLEN = 20) [ SqlColumnNumber = 30, SqlFieldName = ICD_CODE ];Property ICDNAME As%Library.String(MAXLEN = 64) [ SqlColumnNumber = 31, SqlFieldName = ICD_NAME ];Property HEALTHCARDNO As%Library.String(MAXLEN = 32) [ SqlColumnNumber = 32, SqlFieldName = HEALTH_CARD_NO ];Property HEALTHCARDTYPE As%Library.String(MAXLEN = 64) [ SqlColumnNumber = 33, SqlFieldName = HEALTH_CARD_TYPE ];Property SURGCODE As%Library.String(MAXLEN = 20) [ SqlColumnNumber = 34, SqlFieldName = SURG_CODE ];Property SURGNAME As%Library.String(MAXLEN = 64) [ SqlColumnNumber = 35, SqlFieldName = SURG_NAME ];Property AGE As%Library.String(MAXLEN = 10) [ SqlColumnNumber = 36 ];Property DISEASECODE As%Library.Integer(MAXVAL = 2147483647, MINVAL = -2147483648) [ SqlColumnNumber = 37, SqlFieldName = DISEASE_CODE ];Property DISEASENAME As%Library.String(MAXLEN = 64) [ SqlColumnNumber = 38, SqlFieldName = DISEASE_NAME ];Property LastChangedTime As%Library.TimeStamp [ SqlColumnNumber = 39 ];Property LastCreateTime As%Library.TimeStamp [ SqlColumnNumber = 40 ];Property GROUPORG As%Library.String(MAXLEN = 50) [ SqlColumnNumber = 41 ];Parameter USEEXTENTSET = 1;/// Bitmap Extent Index auto-generated by DDL CREATE TABLE statement.  Do not edit the SqlName of this index.
Index DDLBEIndex [ Extent, SqlName = "%%DDLBEIndex", Type = bitmap ];
Index IdxMapOrg On HQORGCODE [ SqlName = IdxMapOrg, Type = index ];
Index IdxVisitorg On (VISITSERIALNO, HQORGCODE) [ SqlName = IdxVisitorg, Type = index ];
}
0
Nicky Zhu · Feb 14, 2023

Thanks for the help from @Stefan Cronje and @Vitaliy Serdtsev 
I'm using the SQL

select count(1) from (

select distinct by (VISIT_SERIAL_NO,HQ_ORG_CODE) VISIT_SERIAL_NO,HQ_ORG_CODE

from %PARALLEL ST_SEARCH_VISIT

)

which ends in 2 seconds in my test env

Since the end user is expecting a query ends in 3 seconds, I'll test its performance with their other conditions combined.

Thanks again for the help.

0