Re: Pulling up direct-correlated ANY_SUBLINK

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: Richard Guo <riguo(at)pivotal(dot)io>
Cc: Antonin Houska <ah(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Pulling up direct-correlated ANY_SUBLINK
Date: 2020-08-19 05:55:16
Message-ID: CAKU4AWp8cyP8OsFBqCJF7tnnTPwS_QR=qwcH2_ppBBg+5e90Bg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 17, 2019 at 4:41 PM Richard Guo <riguo(at)pivotal(dot)io> wrote:

>
> On Thu, Sep 12, 2019 at 11:35 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
>
>> Richard Guo <riguo(at)pivotal(dot)io> wrote:
>>
>> > On Wed, Sep 11, 2019 at 3:25 PM Antonin Houska <ah(at)cybertec(dot)at>
>> > wrote:
>> >
>> >
>> > Nevertheless, I don't know how to overcome the problems that I
>> > mentioned
>> > upthread.
>> >
>> >
>> > Do you mean the problem "the WHERE clause of the subquery didn't
>> > participate in the SEMI JOIN evaluation"? Good news is it has been
>> > fixed
>> > by commit 043f6ff0 as I mentioned upthread.
>>
>> Do you say that my old patch (rebased) no longer breaks the regression
>> tests?
>>
>
> I think so.
>
>
>>
>> (I noticed your other email in the thread which seems to indicate that
>> you're
>> no lo longer interested to work on the feature, but asking out of
>> curiosity.)
>>
>
> Tom pointed out that even if we pull up the subquery with the help of
> LATERAL, we cannot make sure we will end up with a better plan, since
> LATERAL pretty much constrains things to use a nestloop. Hmm, I think
> what he said makes sense.
>
> Thanks
> Richard
>
>

Even if we can't do this for the general case, I still think we can do
something
for some special cases, for example:
select count(*) from j1 where (i) *in* (select i from j2 where* j2.im5 =
j1.im5*);
can be converted to
select count(*) from t1 where (i, im5) in (select i, im5 from j2);

The conversion can happen just before the convert_ANY_sublink_to_join.

@@ -399,6 +483,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
/* Is it a convertible ANY or EXISTS clause? */
if (sublink->subLinkType == ANY_SUBLINK)
{
+ reduce_sublink_correlation_exprs(root, sublink);
if ((j = convert_ANY_sublink_to_join(root, sublink,

available_rels1)) != NULL)

However we have to do lots of pre checking for this, the below is
something I can think for now.

1). It must be an in-subquery.
2). The op in correlation_expr must be a mergeable op.
3). no aggregation call in subquery->targetList and subquery->havingQual.
4). no limit/offset cause.
5). No volatile function involved for safety.

I can't tell how often it is, I just run into this by my own and search the
maillist and get only 1 report [1]. Is it something worth doing or do we
have
a better strategy to handle it? Thanks!

[1] https://www.postgresql.org/message-id/3691.1342650974@sss.pgh.pa.us

--
Best Regards
Andy Fan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jiří Fejfar 2020-08-19 06:08:42 Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Previous Message Tom Lane 2020-08-19 05:53:07 Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails