Re: A new strategy for pull-up correlated ANY_SUBLINK

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A new strategy for pull-up correlated ANY_SUBLINK
Date: 2023-01-06 06:16:23
Message-ID: CALDaNm3ZOAzjHeeQN04ZoBdJjxkziGGyJLEDajuDw2bQC2EZ0w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 13 Nov 2022 at 04:15, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> writes:
> > In the past we pull-up the ANY-sublink with 2 steps, the first step is to
> > pull up the sublink as a subquery, and the next step is to pull up the
> > subquery if it is allowed. The benefits of this method are obvious,
> > pulling up the subquery has more requirements, even if we can just finish
> > the first step, we still get huge benefits. However the bad stuff happens
> > if varlevelsup = 1 involves, things fail at step 1.
>
> > convert_ANY_sublink_to_join ...
>
> > if (contain_vars_of_level((Node *) subselect, 1))
> > return NULL;
>
> > In this patch we distinguish the above case and try to pull-up it within
> > one step if it is helpful, It looks to me that what we need to do is just
> > transform it to EXIST-SUBLINK.
>
> This patch seems awfully messy to me. The fact that you're having to
> duplicate stuff done elsewhere suggests at the least that you've not
> plugged the code into the best place.
>
> Looking again at that contain_vars_of_level restriction, I think the
> reason for it was just to avoid making a FROM subquery that has outer
> references, and the reason we needed to avoid that was merely that we
> didn't have LATERAL at the time. So I experimented with the attached.
> It seems to work, in that we don't get wrong answers from any of the
> small number of places that are affected. (I wonder though whether
> those test cases still test what they were intended to, particularly
> the postgres_fdw one. We might have to try to hack them some more
> to not get affected by this optimization.) Could do with more test
> cases, no doubt.
>
> One thing I'm not at all clear about is whether we need to restrict
> the optimization so that it doesn't occur if the subquery contains
> outer references falling outside available_rels. I think that that
> case is covered by is_simple_subquery() deciding later to not pull up
> the subquery based on LATERAL restrictions, but maybe that misses
> something.
>
> I'm also wondering whether the similar restriction in
> convert_EXISTS_sublink_to_join could be removed similarly.
> In this light it was a mistake for convert_EXISTS_sublink_to_join
> to manage the pullup itself rather than doing it in the two-step
> fashion that convert_ANY_sublink_to_join does it.

The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
b82557ecc2ebbf649142740a1c5ce8d19089f620 ===
=== applying patch ./v2-0001-use-LATERAL-for-ANY_SUBLINK.patch
patching file contrib/postgres_fdw/expected/postgres_fdw.out
...
Hunk #2 FAILED at 6074.
Hunk #3 FAILED at 6087.
2 out of 3 hunks FAILED -- saving rejects to file
src/test/regress/expected/join.out.rej

[1] - http://cfbot.cputube.org/patch_41_3941.log

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message shiy.fnst@fujitsu.com 2023-01-06 06:17:29 RE: Force streaming every change in logical decoding
Previous Message Dilip Kumar 2023-01-06 06:13:43 Re: making relfilenodes 56 bits