Re: Reduce LEFT/FULL JOIN to ANTI JOIN in more cases

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Isaac Morland <isaac(dot)morland(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Reduce LEFT/FULL JOIN to ANTI JOIN in more cases
Date: 2026-06-15 09:08:36
Message-ID: CAMbWs4_zJhF0fyng6SMYVnBvxPK1hhTKcn=FN=Kdnnoa4_DiuA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 5, 2026 at 4:50 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> The reason the planner doesn't recognize "WHERE b IS NULL" during
> outer-join reduction is that find_forced_null_var() skips NullTest
> nodes with argisrow=true, which is what the parser produces for a
> whole-row IS NULL test. So the anti-join reduction never sees it.
>
> To handle it we'd need to decompose the whole-row test into per-column
> checks, and that expansion needs the relation's column list, which
> find_forced_null_var() doesn't have. One option would be to let
> find_forced_null_var() recognize the whole-row IS NULL and hand back
> the whole-row Var, then have reduce_outer_joins() expand it into the
> relation's columns before running the non-null proof.

After a second thought, I don't think we need to explicitly expand the
whole-row Var in order to support "WHERE b IS NULL": proving any one
column of b non-null in matching rows is sufficient to treat it as an
anti-join. I tried it and it coped with the existing machinery fairly
cleanly. See 0003.

A subtlety is that a non-null whole-row datum can still have all
columns NULL. So a datum-level proof does not establish that the
columns are non-null, and must be ignored for the whole-row case.
(Consider "t1 left join t2 on t1 = t2 where t2 is null", where t2 has
no NOT NULL column.) Relatedly, a row-format test on a composite-type
column is not handled, since "b.c IS NULL" does not force the column c
null.

While at it, I extended the same idea to "WHERE b IS NOT NULL", which
lets a left join reduce to an inner join. See 0004. The subtlety
there is the reverse: although "IS NOT NULL" being true implies every
field is non-null, we exploit only the weaker "datum is non-null",
since that whole-row fact is shared with datum-level strict contexts
like record comparisons.

0001 and 0002 are as before.

- Richard

Attachment Content-Type Size
v2-0001-Reduce-LEFT-JOIN-to-ANTI-JOIN-using-quals-within-.patch application/octet-stream 18.0 KB
v2-0002-Reduce-FULL-JOIN-to-ANTI-JOIN.patch application/octet-stream 19.5 KB
v2-0003-Reduce-outer-joins-to-anti-joins-for-whole-row-IS.patch application/octet-stream 17.0 KB
v2-0004-Relax-strictness-detection-for-row-format-IS-NOT-.patch application/octet-stream 8.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Chao Li 2026-06-15 09:11:11 Re: pg_restore handles extended statistics inconsistently with statistics data
Previous Message Amit Kapila 2026-06-15 09:06:09 Re: Proposal: Conflict log history table for Logical Replication