Retiring is_pushed_down

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Retiring is_pushed_down
Date: 2023-07-25 07:39:58
Message-ID: CAMbWs49pDTmHbgjXa4xjxbA+pY=7ebZ=YUg1waQqQw8DERgQ-g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

When forming an outer join's joinrel, we have the is_pushed_down flag in
RestrictInfo nodes to distinguish those quals that are in that join's
JOIN/ON condition from those that were pushed down to the joinrel and
thus act as filter quals. Since now we have the outer-join-aware-Var
infrastructure, I think we can check to see whether a qual clause's
required_relids reference the outer join(s) being formed, in order to
tell if it's a join or filter clause. This seems like a more principled
way. (Interesting that optimizer/README actually describes this way in
section 'Relation Identification and Qual Clause Placement'.)

So I give it a try to retire is_pushed_down as attached. But there are
several points that may need more thoughts.

* When we form an outer join, it's possible that more than one outer
join relid is added to the join's relid set, if there are any pushed
down outer joins per identity 3. And it's also possible that no outer
join relid is added, for an outer join that has been pushed down. So
instead of checking if a qual clause's required_relids include the outer
join's relid, I think we should check if its required_relids overlap
the outer join relids that are being formed, which means that we should
use bms_overlap(rinfo->required_relids, ojrelids) rather than
bms_is_member(ojrelid, rinfo->required_relids). And we should do this
check only for outer joins.

* This patch calculates the outer join relids that are being formed
generally in this way:

bms_difference(joinrelids, bms_union(outerrelids, innerrelids))

Of course this can only be used after the outer join relids has been
added by add_outer_joins_to_relids(). This calculation is performed
multiple times during planning. I'm not sure if this has performance
issues. Maybe we can calculate it only once and store the result in
some place (such as in JoinPath)?

* ANTI joins are treated as outer joins but sometimes they do not have
rtindex (such as ANTI joins derived from SEMI). This would be a problem
with this new check. As an example, consider query

select * from a where not exists (select 1 from b where a.i = b.i) and
CURRENT_USER = SESSION_USER;

The pseudoconstant clause 'CURRENT_USER = SESSION_USER' is supposed to
be treated as a filter clause but the new check would treat it as a join
clause because the outer join relid set being formed is empty since the
ANTI join here does not have an rtindex. To solve this problem, this
patch manually adds a RTE for a ANTI join derived from SEMI.

Any thoughts?

Thanks
Richard

Attachment Content-Type Size
v1-0001-Retiring-is_pushed_down.patch application/octet-stream 39.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-07-25 07:43:16 Re: WAL Insertion Lock Improvements
Previous Message Michael Paquier 2023-07-25 07:36:42 Re: pg_rewind fails with in-place tablespace