How do I force outliers to be ignored - 2016.1

Question

Question

Kevin Furze · May 14, 2016

I have an sql that is refusing to play fairly.

I have

tuned affected tables
made sure I'm on the latest version (2016.1)
got enough global buffers
tried forcing the sql to use outliers correctly

Still nothing. here's the SQL

SELECT id , dealerid -> extnumber AS exNum , dealerid -> deliveryorder AS ord , dealerid -> deliverytown AS town , consTypeID , constypeid -> GroupInitials AS gty , barcode , constypeid , constypeid -> Initials AS ty , constypeid -> MaxDaysInWarehouse AS maxWh , daysInWh , dealerid , statusActual , wgtCurrent , packType -> ShortDesc , jdType , statusCode , mustGo , wgtAccurate FROM sqluser . newconsignments WHERE dealerid -> routeid = 4 AND statusCode IN ( 80,240) AND consolidateID = 0

Its the consolidateID=0 that's the problem, It's a similar issue talked of in https://community.intersystems.com/post/global-summit-2016-find-and-fix-slow-query the issue talked of "closed cases" - in my case, I there are many more values of zero than other values (ratio 500,000:150)

I get a rowcount of 31 and , performance of 4.625 and global counts of just over 1,000,990 using that sql

the plan clearly shows it's dropping down the consolidateID index first, then the dealerid->routeid, then the statusCode but has to drop down 500,000+ consolidateID's and then find the dealers and stautsCodes.

I've looked at the indicies (mostly bitmaps) and the selectivities after tuning tables,

and all the selectivities suggest that the consolidationID (0.0357%) should be used last as the other selectivities are much higher.

I though I would follow the advice and change the wehere clause to AND ((consolidateID = 0)) by using the double brackets but all that did getter a slightly better speed (probably because the data's in the buffers now) and the global count's went down to 1,000,831 (a drop of about 100) and it's still dropping down the consolidateID first (wrongly)

I use this SQL inside a fetch chunk of code and so to make it so much more efficient, I deleted the AND ((consolidateID = 0)) clause and the fetched just the 30-50 rows and ignored those where consolidateID<>0

so of course, the question is:

how do I force the AND ((consolidateID = 0)) to be used as the LAST filter, not the first

kevin

Discussion (5)0

Log in or sign up to continue

Tom Woodfin · May 14, 2016

You have the right idea to 'literalize' the 0 so it isn't treated as a parameter, but the syntax isn't quite right. Try:

AND consolidateID = ((0))

Then the optimizer should know that this condition will be non-selective.

1 0

score 0 · Answer 1 · 2016-05-14T12:02:05-04:00

tom, many thanks

rows=31, Performance: 0.191 seconds 39025 global access

massive difference.

I should look a lot closer at whats in front of me when I look at worked examples.

kev

score 0 · Answer 2 · 2016-05-14T12:20:19-04:00

I've been following the tutorial, and I noticed %PARALLEL as in select %parallel from sqluser.newConsignments

it's forcing multi-cores to be used to solve the sql

i tried that on exactly the same sql above and the first time I ran it the time went back up to over 4 seconds, but when I ran it again (buffered globals) I got even better results.

Row count: 31 Performance: 0.151 seconds 294 global references

looking at the plan clearly showed a different way of fetching the same data.

seems there's some mileage in playing with the sql with subtly different parameters especially if every second counts

score 0 · Answer 3 · 2016-05-14T17:04:54-04:00

Great!

Note that the infrastructure for %PARALLEL is substantially improved in the upcoming 2016.2 release. If your query can be parallelized, you ought to see even better performance there.

score 0 · Answer 4 · 2016-05-16T12:41:15-04:00

Kevin Furze · May 16, 2016

Nice to hear

0 0