Reduce LEFT/FULL JOIN to ANTI JOIN in more cases

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Reduce LEFT/FULL JOIN to ANTI JOIN in more cases
Date: 2026-06-05 02:55:12
Message-ID: CAMbWs49H9khf+1GwyzD0TYaks6cE-OU+1mbUiOn3gZSoBiO7zg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

reduce_outer_joins() already recognizes that a LEFT JOIN is really an
anti-join when an upper qual forces a nullable-side Var to be NULL
while that Var is actually non-null in every matching row. In that
case only the null-extended (unmatched) rows can satisfy the upper
qual, which is exactly anti-join semantics, so we switch JOIN_LEFT to
JOIN_ANTI. This is worth detecting because an anti-join is usually
much cheaper than computing the outer join and then filtering the
result with the IS NULL clause.

Currently we recognize the Var as non-null only if the join's own
clauses are strict for it, or if it is defined NOT NULL. But a strict
qual inside the RHS subtree is just as good a proof, since it holds
for every row the RHS emits. For example,

SELECT * FROM t1 LEFT JOIN
(t2 JOIN t3 ON t2.c = t3.c) ON t1.a = t2.b
WHERE t3.c IS NULL;

is an anti-join (t2.c = t3.c makes t3.c non-null), but we don't spot
it today. 0001 does that.

0002 does the same for FULL joins, where a forced-null Var on either
side is enough: once it is proven non-null, every row on that side
drops out and only the other side's unmatched rows remain. So, with
t2.a defined NOT NULL

SELECT * FROM t1 FULL JOIN t2 ON t1.x = t2.x
WHERE t2.a IS NULL;

or

SELECT * FROM t1 FULL JOIN
(t2 JOIN t3 ON t2.c = t3.c) ON t1.a = t2.b
WHERE t3.c IS NULL;

becomes t1 ANTI t2. But note that we can't reuse the join's own ON
quals as the proof here, the way the LEFT JOIN code does, since they
don't hold for the unmatched rows we need to cover.

Regarding cost: the extra planning work is small. The proving quals
ride along on the pass-1 walk that reduce_outer_joins() already does
for nullable_rels, so there is no new traversal, and the non-null
proof in pass 2 only runs when a join actually has a forced-null Var
sitting above it. Given that an anti-join is usually much cheaper
than a left or full join, it looks like a good trade.

Thoughts?

- Richard

Attachment Content-Type Size
v1-0001-Reduce-LEFT-JOIN-to-ANTI-JOIN-using-quals-within-.patch application/octet-stream 18.0 KB
v1-0002-Reduce-FULL-JOIN-to-ANTI-JOIN.patch application/octet-stream 19.5 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Isaac Morland 2026-06-05 03:00:42 Re: Reduce LEFT/FULL JOIN to ANTI JOIN in more cases
Previous Message Peter Smith 2026-06-05 02:29:04 Re: Proposal: Conflict log history table for Logical Replication