| From: | Richard Guo <guofenglinux(at)gmail(dot)com> |
|---|---|
| To: | Isaac Morland <isaac(dot)morland(at)gmail(dot)com> |
| Cc: | Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Reduce LEFT/FULL JOIN to ANTI JOIN in more cases |
| Date: | 2026-06-05 07:50:13 |
| Message-ID: | CAMbWs4_xBUhjimekdPNnVu-m-71C5C_nt8tz6ZeJAYehtaXy7Q@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Fri, Jun 5, 2026 at 12:00 PM Isaac Morland <isaac(dot)morland(at)gmail(dot)com> wrote:
> On Thu, 4 Jun 2026 at 22:55, Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
>> reduce_outer_joins() already recognizes that a LEFT JOIN is really an
>> anti-join when an upper qual forces a nullable-side Var to be NULL
>> while that Var is actually non-null in every matching row. In that
>> case only the null-extended (unmatched) rows can satisfy the upper
>> qual, which is exactly anti-join semantics, so we switch JOIN_LEFT to
>> JOIN_ANTI. This is worth detecting because an anti-join is usually
>> much cheaper than computing the outer join and then filtering the
>> result with the IS NULL clause.
> While you’re at it, any chance of changing it so that "a LEFT JOIN b WHERE b IS NULL" is guaranteed not to have worse performance than " a LEFT JOIN b WHERE b.f IS NULL"? I've had this and asked about it here and it was suggested that I should specify the primary key fields of b; but surely if I want an antijoin it's clearer and better implicit documentation if I don't zero in on specific fields. I think somebody else is suggesting "a LEFT ANTI JOIN b" which would be even better.
The reason the planner doesn't recognize "WHERE b IS NULL" during
outer-join reduction is that find_forced_null_var() skips NullTest
nodes with argisrow=true, which is what the parser produces for a
whole-row IS NULL test. So the anti-join reduction never sees it.
To handle it we'd need to decompose the whole-row test into per-column
checks, and that expansion needs the relation's column list, which
find_forced_null_var() doesn't have. One option would be to let
find_forced_null_var() recognize the whole-row IS NULL and hand back
the whole-row Var, then have reduce_outer_joins() expand it into the
relation's columns before running the non-null proof.
That said, I'm not sure it's worth the churn. IMHO, the whole-row IS
NULL test seems less common than the column form. If you want an
anti-join, it seems more natural to just write NOT EXISTS. For the
same reason, I'm not convinced a "LEFT ANTI JOIN" would buy us much
over NOT EXISTS.
- Richard
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Yugo Nagata | 2026-06-05 07:56:41 | Rename EXISTS-to-ANY converted subplan to exists_to_any |
| Previous Message | Chao Li | 2026-06-05 07:48:00 | Fix domain fast defaults on empty tables |