Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: 曾文旌(义从) <wenjing(dot)zwj(at)alibaba-inc(dot)com>, shawn wang <shawn(dot)wang(dot)pg(at)gmail(dot)com>, "ggysxcq(at)gmail(dot)com" <ggysxcq(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: wjzeng <wjzeng2012(at)gmail(dot)com>
Subject: Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?
Date: 2021-12-08 03:26:28
Message-ID: 0794833d-5793-e763-6b2c-43638584bfee@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 12/7/21 10:44, 曾文旌(义从) wrote:
> Hi Hackers
>
> For my previous proposal, I developed a prototype and passed
> regression testing. It works similarly to subquery's qual pushdown.
> We know that sublink expands at the beginning of each level of
> query. At this stage, The query's conditions and equivalence classes
> are not processed. But after generate_base_implied_equalities the
> conditions are processed, which is why qual can push down to
> subquery but sublink not.
>
> My POC implementation chose to delay the sublink expansion in the
> SELECT clause (targetList) and where clause. Specifically, it is
> delayed after generate_base_implied_equalities. Thus, the equivalent
> conditions already established in the Up level query can be easily
> obtained in the sublink expansion process (make_subplan).
>
> For example, if the up level query has a.id = 10 and the sublink
> query has a.id = b.id, then we get b.id = 10 and push it down to the
> sublink quey. If b is a partitioned table and is partitioned by id,
> then a large number of unrelated subpartitions are pruned out, This
> optimizes a significant amount of Planner and SQL execution time,
> especially if the partitioned table has a large number of
> subpartitions and is what I want.
>
> Currently, There were two SQL failures in the regression test,
> because the expansion order of sublink was changed, which did not
> affect the execution result of SQL.
>
> Look forward to your suggestions on this proposal.
>

I took a quick look, and while I don't see / can't think of any problems
with delaying it until after generating implied equalities, there seems
to be a number of gaps.

1) Are there any regression tests exercising this modified behavior?
Maybe there are, but if the only changes are due to change in order of
targetlist entries, that doesn't seem like a clear proof.

It'd be good to add a couple tests exercising both the positive and
negative case (i.e. when we can and can't pushdown a qual).

2) apparently, contrib/postgres_fdw does crash like this:

#3 0x000000000077b412 in adjust_appendrel_attrs_mutator
(node=0x13f7ea0, context=0x7fffc3351b30) at appendinfo.c:470
470 Assert(!IsA(node, SubLink));
(gdb) p node
$1 = (Node *) 0x13f7ea0
(gdb) p *node
$2 = {type = T_SubLink}

Backtrace attached.

3) various parts of the patch really need at least some comments, like:

- try_push_outer_qual_to_sublink_query really needs some docs

- new stuff at the end of initsplan.c

4) generate_base_implied_equalities

shouldn't this

if (ec->ec_processed)
;

really be?

if (ec->ec_processed)
continue;

5) I'm not sure why we need the new ec_processed flag.

6) So we now have lazy_process_sublink callback? Does that mean we
expand sublinks in two places - sometimes lazily, sometimes not?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
backtrace.txt text/plain 5.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-12-08 03:51:12 Re: parse_subscription_options - suggested improvements
Previous Message houzj.fnst@fujitsu.com 2021-12-08 03:22:16 RE: parallel vacuum comments