Re: A new strategy for pull-up correlated ANY_SUBLINK

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A new strategy for pull-up correlated ANY_SUBLINK
Date: 2022-11-12 22:45:43
Message-ID: 247851.1668293143@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> writes:
> In the past we pull-up the ANY-sublink with 2 steps, the first step is to
> pull up the sublink as a subquery, and the next step is to pull up the
> subquery if it is allowed. The benefits of this method are obvious,
> pulling up the subquery has more requirements, even if we can just finish
> the first step, we still get huge benefits. However the bad stuff happens
> if varlevelsup = 1 involves, things fail at step 1.

> convert_ANY_sublink_to_join ...

> if (contain_vars_of_level((Node *) subselect, 1))
> return NULL;

> In this patch we distinguish the above case and try to pull-up it within
> one step if it is helpful, It looks to me that what we need to do is just
> transform it to EXIST-SUBLINK.

This patch seems awfully messy to me. The fact that you're having to
duplicate stuff done elsewhere suggests at the least that you've not
plugged the code into the best place.

Looking again at that contain_vars_of_level restriction, I think the
reason for it was just to avoid making a FROM subquery that has outer
references, and the reason we needed to avoid that was merely that we
didn't have LATERAL at the time. So I experimented with the attached.
It seems to work, in that we don't get wrong answers from any of the
small number of places that are affected. (I wonder though whether
those test cases still test what they were intended to, particularly
the postgres_fdw one. We might have to try to hack them some more
to not get affected by this optimization.) Could do with more test
cases, no doubt.

One thing I'm not at all clear about is whether we need to restrict
the optimization so that it doesn't occur if the subquery contains
outer references falling outside available_rels. I think that that
case is covered by is_simple_subquery() deciding later to not pull up
the subquery based on LATERAL restrictions, but maybe that misses
something.

I'm also wondering whether the similar restriction in
convert_EXISTS_sublink_to_join could be removed similarly.
In this light it was a mistake for convert_EXISTS_sublink_to_join
to manage the pullup itself rather than doing it in the two-step
fashion that convert_ANY_sublink_to_join does it.

regards, tom lane

Attachment Content-Type Size
v2-0001-use-LATERAL-for-ANY_SUBLINK.patch text/x-diff 6.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2022-11-13 00:47:41 Re: PGDOCS - Logical replication GUCs - added some xrefs
Previous Message Andrey Borodin 2022-11-12 22:01:50 Re: Use fadvise in wal replay