Re: How does postgres store the join predicate for a relation in a given query

From: Gourav Kumar <gourav1905(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How does postgres store the join predicate for a relation in a given query
Date: 2017-10-13 21:45:02
Message-ID: CAPzqDmi5qD4XG=ERj_R4U=AjsoZf5HxEzzUMaaCiZggh_H_HJg@mail.gmail.com
Lists: pgsql-hackers

Why do have_relevant_joinclause() and have_relevant_eclass_joinclause()
return true for every possible join in the query given below, even for
pairs of relations that have no join predicate between them,
e.g. the joins between ss1 & ws3, ss2 & ws3, etc.?

The query is:
TPC-DS query 50

-- query 50 in stream 0 using template query31.tpl
with ss as
(select ca_county,d_qoy, d_year,sum(ss_ext_sales_price) as store_sales
from store_sales,date_dim,customer_address
where ss_sold_date_sk = d_date_sk
and ss_addr_sk=ca_address_sk
group by ca_county,d_qoy, d_year),
ws as
(select ca_county,d_qoy, d_year,sum(ws_ext_sales_price) as web_sales
from web_sales,date_dim,customer_address
where ws_sold_date_sk = d_date_sk
and ws_bill_addr_sk=ca_address_sk
group by ca_county,d_qoy, d_year)
select /* tt */
ss1.ca_county
,ss1.d_year
,ws2.web_sales/ws1.web_sales web_q1_q2_increase
,ss2.store_sales/ss1.store_sales store_q1_q2_increase
,ws3.web_sales/ws2.web_sales web_q2_q3_increase
,ss3.store_sales/ss2.store_sales store_q2_q3_increase
from
ss ss1
,ss ss2
,ss ss3
,ws ws1
,ws ws2
,ws ws3
where
ss1.d_qoy = 1
and ss1.d_year = 2000
and ss1.ca_county = ss2.ca_county
and ss2.d_qoy = 2
and ss2.d_year = 2000
and ss2.ca_county = ss3.ca_county
and ss3.d_qoy = 3
and ss3.d_year = 2000
and ss1.ca_county = ws1.ca_county
and ws1.d_qoy = 1
and ws1.d_year = 2000
and ws1.ca_county = ws2.ca_county
and ws2.d_qoy = 2
and ws2.d_year = 2000
and ws1.ca_county = ws3.ca_county
and ws3.d_qoy = 3
and ws3.d_year = 2000
and case when ws1.web_sales > 0 then ws2.web_sales/ws1.web_sales else null end
> case when ss1.store_sales > 0 then ss2.store_sales/ss1.store_sales else null end
and case when ws2.web_sales > 0 then ws3.web_sales/ws2.web_sales else null end
> case when ss2.store_sales > 0 then ss3.store_sales/ss2.store_sales else null end
order by web_q2_q3_increase;

-- end
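
A possible explanation, offered as a rough sketch and not as a description
of the planner source: the planner folds mergejoinable equality clauses into
equivalence classes, so the five ca_county equalities in the WHERE clause
above connect all six relations transitively. The Python below only models
that idea. build_classes(), rel_of() and has_implied_join() are hypothetical
names, not PostgreSQL functions, and the CASE comparisons are left out
because they are not equalities.

```python
from itertools import combinations

# Column equalities copied from the WHERE clause of the query above.
EQUALITIES = [
    ("ss1.ca_county", "ss2.ca_county"),
    ("ss2.ca_county", "ss3.ca_county"),
    ("ss1.ca_county", "ws1.ca_county"),
    ("ws1.ca_county", "ws2.ca_county"),
    ("ws1.ca_county", "ws3.ca_county"),
]


def build_classes(equalities):
    """Group columns into equivalence classes with a small union-find."""
    parent = {}

    def find(col):
        parent.setdefault(col, col)
        while parent[col] != col:
            parent[col] = parent[parent[col]]  # path halving
            col = parent[col]
        return col

    for left, right in equalities:
        parent[find(left)] = find(right)

    classes = {}
    for col in list(parent):
        classes.setdefault(find(col), set()).add(col)
    return list(classes.values())


def rel_of(col):
    """Return the relation alias of a qualified column name (assumed 'rel.col')."""
    return col.split(".")[0]


def has_implied_join(classes, rel_a, rel_b):
    """True if some equivalence class has a member from each relation."""
    for members in classes:
        rels = {rel_of(col) for col in members}
        if rel_a in rels and rel_b in rels:
            return True
    return False


classes = build_classes(EQUALITIES)
rels = ["ss1", "ss2", "ss3", "ws1", "ws2", "ws3"]

# Pairs that are named together in an equality written in the query text.
explicit = {frozenset((rel_of(l), rel_of(r))) for l, r in EQUALITIES}

# Pairs joined only through the transitive closure of those equalities.
implied_only = [
    (a, b)
    for a, b in combinations(rels, 2)
    if has_implied_join(classes, a, b) and frozenset((a, b)) not in explicit
]

for a, b in implied_only:
    print(f"{a} & {b}: no explicit predicate, but the columns share a class")
```

In this model all six ca_county columns land in a single class, so ss1 & ws3
and ss2 & ws3 share a class although no predicate in the query text names
both. That would be consistent with have_relevant_eclass_joinclause()
returning true for every pair.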

On 13 October 2017 at 01:00, Gourav Kumar <gourav1905(at)gmail(dot)com> wrote:

> Well for this given query it is possible. I haven't come across any such
> query yet.
>
> Possibly because I am more concerned about the TPCDS and TPCH benchmarks,
> where it's less likely to occur.
>
> On 13 October 2017 at 00:52, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Gourav Kumar <gourav1905(at)gmail(dot)com> writes:
>> > A Join clause/predicate will only mention 2 relations. It can't have 3 or
>> > more relations.
>>
>> Really? What of, say,
>>
>> select ... from a,b,c where (a.x + b.y) = c.z;
>>
>> regards, tom lane
>>
>
>
>
> --
> Thanks,
> Gourav Kumar
> Computer Science and Automation
> Indian Institute of Science
>

--
Thanks,
Gourav Kumar
Computer Science and Automation
Indian Institute of Science
