Question
CM Wang · Dec 27, 2017

the associativity when the INNER JOIN and LEFT JOIN are mixed

select * from A left join B on A.x = B.x inner join C on B.y = C.y

 

does Cache SQL treat the join path: LEFT_JOIN(A,INNER_JOIN(B.C)) also a valid candidate path?

 

Thanks.

 

 

 

0
0 780
Discussion (8)1
Log in or sign up to continue

Yes.

SELECT
p.id, p1.id, p2.id
FROM Sample.Person p
LEFT JOIN Sample.Person p1 ON p1.id=p.id
INNER JOIN Sample.Person p2 ON p2.id=p.id

The JOIN conditions in your sample does not match my original query.

The sample you provide is P.id  = P1.id and P.id = P2.id,

in the previous sample, it should be like P.id = P1.id and P1.id =  P2.id.

Also, you self join on the identical fields "ID", so I guess the engine could infer p1.id = p2.id by "P.id  = P1.id and P.id = P2.id" , this would interfere my intention.

That's also the reason I explicitly specify two different fields "x" and "y" in my sample.  

I am not good at relational theory, but does the associativity really hold when the LEFT and INNER coexist?  

Maybe %NOFLATTEN can help, if we're sure that inner query returns few results.

Also you can try to use IN.

in my sample, it should be like P.id = P1.id and P1.id =  P2.id.

It would work like that too.

Also, you self join on the identical fields "ID", so I guess the engine could infer p1.id = p2.id by "P.id  = P1.id and P.id = P2.id" , this would interfere my intention.

Please consider posting:

  • sample classes
  • sample data
  • description of data you want to get
  • what data you want to get
  • query

You are totally right. 
I tried to stick with the original questions. And I wouldn't formulate it that "traditional" way.
As you noticed there are more efficient and  meaningful constructs possible. 

Try

select * from A 
  left join 
    ( B inner join C on B.y = C.y )
  on A.x = B.x

That should give you only the Bs that have C match on y

My sample took somewhat longer to prepare then Edward's smiley and Gerd's

#1) to evaluate SQL statements use MgmtPortal/ Explorer/SQL and check the generated query plan.

#2) if you don't use special attributes to SELECT clause or sub-queries all SQL statements are strictly worked from left to right.
 your first statement is ok for SQL your 2nd is just a fragment that I interpreted as to be a sub-query

Summary: they are not identical.

Now the example transformed for namespace SAMPLES to have 3 tables as A, B, and C:

select A.id,B.id,c.Id from
    sample.person A left join 
    sample.employee B on A.id = B.spouse 
    inner join sample.company C on B.company = C.id 

 

Row count: 100

Query Plan:

Relative cost = 4200
Read extent bitmap Sample.Employee.$Employee, looping on ID.
For each row:
     Read master map Sample.Employee.IDKEY, using the given idkey value.
    Read extent bitmap Sample.Person.$Person, using the given ID.
 For each row:
     Read master map Sample.Company.IDKEY, using the given idkey value.
 Output the row.
 

transformed to sub-queries:

select aid,bid,cid from
(select A.id aid, b.id bid, b.company cid 
    from sample.person A 
    left join sample.employee B
    on A.id = B.spouse)
 inner join sample.company C 
 on cid = C.id 

it is obviously identic ! just more expressive.

Row count: 100

Query Plan:

Relative cost = 4200
Read extent bitmap Sample.Employee.$Employee, looping on ID.
For each row:
     Read master map Sample.Employee.IDKEY, using the given idkey value.
    Read extent bitmap Sample.Person.$Person, using the given ID.
 For each row:
     Read master map Sample.Company.IDKEY, using the given idkey value.
 Output the row.

the next  LEFT_JOIN(A,INNER_JOIN(B.C))  requires immediate transformation

select A.id,bid,cid from
sample.person A left join
(select b.id as bid,c.id cid, b.spouse bsp 
 from sample.employee B
 inner join sample.company C on B.company = C.id) 
 on A.id = bsp

Row count:  237  (!!!!)

Query Plan:

Relative cost = 32408
Read extent bitmap Sample.Person.$Person, looping on ID.
For each row:
 Call module E.
 Read temp-file A, using the given VIEW column #3, and looping on VIEW counter.
 For each row:
 Generate a row padded with NULL for the view if no row qualified.
 Output the row.
module E
Read extent bitmap Sample.Employee.$Employee, looping on ID.
For each row:
 Read master map Sample.Employee.IDKEY, using the given idkey value.
 Read master map Sample.Company.IDKEY, using the given idkey value.
 Increment view row counter.
 Add a row to temp-file A, subscripted by VIEW column #3 and VIEW counter,
 with node data of VIEW column #1 and VIEW column #2.

So both variants are possible though the result is different

HTH,