Mack Altman · Dec 2, 2016

What are the best ways to query large tables?

We rarely use SQL within our org, mostly because of the performance issues we run into with the volume of data we are reviewing.

Aside from the standard performance measures for non-Caché databases, are there any recommended approaches when querying large tables?

The table would have roughly 50M records, but there is not a fixed number of sub-nodes.


Well, if you want good performance in SQL queries, you need indexes. What do your tables look like? We can definitely get fast performance on a 50M-row table; it might just take some work (which we're happy to help with).

The class referencing the account global doesn't even finish a COUNT (e.g. SELECT COUNT(acctID) FROM namespace.account). The global itself has multiple parent and child nodes, as well as a varying number of positions for each. I've posted the structure below.



The first question for any SQL performance issue is:  Have you run TuneTable?

It is very important to have correct values for ExtentSize, Selectivity, and Block Count, whether the info comes from a developer entering it manually or from running TuneTable. Without these values, the Query Optimizer can only guess what the right plan might be.

The next step is to provide proper indices to support the queries you are going to write. Without a lot more info, it is impossible to say what those might be or which type of index (bitmap or standard) would be best.
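To make the two index types concrete, here is a rough sketch in Caché SQL DDL. Only namespace.account and acctID come from this thread; the columns acctStatus and lastName and all of the index names are hypothetical. It also assumes the table accepts indices defined through DDL. A class mapped onto an existing global with custom storage may need its indices defined and maintained in the class and storage map instead.

```sql
-- Sketch only: acctStatus and lastName are hypothetical columns,
-- and the index names are made up for illustration.

-- Standard index: suits columns with many distinct values,
-- such as a name used in WHERE or ORDER BY clauses.
CREATE INDEX idxLastName ON namespace.account (lastName);

-- Bitmap index: suits columns with few distinct values
-- (a status flag, for example). Assumes the table's row IDs
-- are positive integers.
CREATE BITMAP INDEX idxStatus ON namespace.account (acctStatus);

-- A query that either index could support:
SELECT acctID
FROM namespace.account
WHERE acctStatus = 'OPEN' AND lastName = 'Smith';
```

After adding an index, checking the query plan shows whether the optimizer actually uses it.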

I ran TuneTable on one table and it seemed rather quick. Following that, I used the 'TuneTables' option, and it's been running in the background for about 2 hours now.

There isn't much available in the 2010.2 docbook about it. Are there any best practices for running it?



Running TuneTable is safe to do at any time. It is CPU intensive, so you might not want to run it at peak workload times, but otherwise there is no restriction on when to run it.


It is important that all your tables have this info, so it is great that you are running it on all of them.

If you have a more recent version of Caché, you will likely benefit from using %PARALLEL, especially if your environment has a large number of cores.
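For reference, %PARALLEL is a keyword placed in the FROM clause. A minimal sketch, reusing the COUNT query from earlier in this thread and assuming a Caché version recent enough to support the keyword:

```sql
-- Sketch only: %PARALLEL is a hint, so the optimizer may still
-- choose to run the query without parallel processing.
SELECT COUNT(acctID)
FROM %PARALLEL namespace.account;
```

Because it is only a hint, it is worth comparing run times with and without it on your own data.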

Unfortunately, we are on 2010.1.