Top from the group SQL query in Caché

Question

Question

Evgeny Shvarov · Jul 8, 2016

Hi!
There is interesting question in Stackoverflow.
This query works for MS SQL:

SELECT * 
FROM 
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY client_id ORDER BY date_updated DESC) AS rn
    FROM client_address
) a
WHERE a.rn = 1

There is one answer there, which solves the task:

SELECT *
FROM 
   client_address a
   LEFT JOIN client_address b
   on a.client_id  = b.client_id 
     and a.date_updated < b.date_updated 
WHERE
   b.client_id is null

But it goes with comment, that there are no Window functions in Caché SQL.

I wonder, are there any other options to make this query in Caché?

Discussion (5)0

Log in or sign up to continue

Timothy Leavitt · Jul 10, 2016

This is a good use case for HAVING. Using Sample.Person as an example, the following queries are equivalent:

SELECT a.Age,A.Name
FROM 
   Sample.Person a
   LEFT JOIN Sample.Person b
   on a.home_state = b.home_state
     and a.DOB < b.DOB
WHERE
   b.ID is null

Implemented with HAVING:

SELECT Age,Name
FROM Sample.Person
GROUP BY Home_State
HAVING Age = MIN(Age)

The second is much more intuitive, and runs significantly faster too.

I suspect your query would be expressed as:

SELECT *
FROM client_address
GROUP BY client_id
HAVING date_updated = MIN(date_updated)

1 0

score 2 · Answer 1 · 2016-07-11T05:20:29-04:00

As far as I understand such usage of GROUP BY and select columns is not standard in SQL world.

That is -- if we GROUP BY some fields, then in SELECT list we can have either fields we group by or other fields as arguments of aggregate functions.

We can write

SELECT home_state, max(age)
FROM sample.person
GROUP BY home_state

But we cannot write

SELECT home_state, max(age), name
FROM sample.person
GROUP BY home_state

It is not clear -- what name out of all rows that have the same home_state do we want.

Consider following data in Sample.Person:

Age Name Home_State
10  John MA
10  Jim  MA

What name will following query return John or Jim?

SELECT Age,Name
FROM Sample.Person
GROUP BY Home_State
HAVING Age = MIN(Age)

I prefer following variant of the query with join:

SELECT Age,Name,home_state
FROM Sample.Person p1
WHERE age =  (
    SELECT min(age)
    FROM Sample.Person p2
    WHERE p1.home_state = p2.home_state)

That returns both rows from the sample data above.

score 1 · Answer 2 · 2016-07-11T09:06:42-04:00

Timothy Leavitt · Jul 11, 2016

Good point!

1 0

score 0 · Answer 3 · 2016-07-12T14:17:59-04:00

Regarding this

WHERE age = ( SELECT min(age) FROM Sample.Person p2 WHERE p1.home_state = p2.home_state)

Some DBs don't respond well to X= {statement that returns more than one row } how would cache respond to two people that have are tied for being the youngest in a state

score 1 · Answer 4 · 2016-07-12T14:39:35-04:00

Conrad,

your question have two parts

a) For queries like

WHERE a = (select ...)

if subquery returns more than one row, Caché will compare left part with first value of subquery.

b) If several people are tied for the minimum age, then following query prints all of them

SELECT Age,Name,home_state
FROM Sample.Person p1
WHERE age =  (
    SELECT min(age)
    FROM Sample.Person p2
    WHERE p1.home_state = p2.home_state)