pgsql: Fix "variable not found in subplan target lists" in semijoin de-

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Fix "variable not found in subplan target lists" in semijoin de-
Date: 2025-08-28 17:49:53
Message-ID: E1urgkz-0026la-10@gemulon.postgresql.org
Lists: pgsql-committers

Fix "variable not found in subplan target lists" in semijoin de-duplication.

One mechanism we have for implementing semi-joins is to de-duplicate
the output of the RHS and then treat the join as a plain inner join.
Initial construction of the join's SpecialJoinInfo identifies the
RHS columns that need to be de-duplicated, but later we may find that
some of those don't need to be handled explicitly, either because
they're known to be constant or because they are redundant with some
previous column.
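
As a rough illustration of the mechanism described above (this is not PostgreSQL code; the function names and sample rows are made up for the example), a semi-join can be computed by first de-duplicating the RHS on the join column and then running an ordinary inner join. Because the de-duplicated RHS has at most one row per key, the inner join cannot emit an LHS row more than once:

```python
# Illustrative sketch only; not PostgreSQL code. All names are hypothetical.

def semi_join_reference(lhs, rhs, lhs_key, rhs_key):
    """Textbook semi-join: keep each LHS row that has at least one RHS match."""
    return [l for l in lhs if any(lhs_key(l) == rhs_key(r) for r in rhs)]


def inner_join_lhs_side(lhs, rhs, lhs_key, rhs_key):
    """Plain inner join, returning the LHS row once per matching RHS row."""
    return [l for l in lhs for r in rhs if lhs_key(l) == rhs_key(r)]


def unique_then_inner_join(lhs, rhs, lhs_key, rhs_key):
    """Semi-join implemented as 'de-duplicate the RHS, then inner join'."""
    seen = set()
    unique_rhs = []
    for r in rhs:
        k = rhs_key(r)
        if k not in seen:          # hash-based de-duplication on the join key
            seen.add(k)
            unique_rhs.append(r)
    return inner_join_lhs_side(lhs, unique_rhs, lhs_key, rhs_key)


lhs_rows = [{"a": 1}, {"a": 2}, {"a": 3}]
rhs_rows = [{"x": 1}, {"x": 1}, {"x": 3}, {"x": 3}, {"x": 4}]

reference = semi_join_reference(
    lhs_rows, rhs_rows, lambda l: l["a"], lambda r: r["x"])
via_unique = unique_then_inner_join(
    lhs_rows, rhs_rows, lambda l: l["a"], lambda r: r["x"])
without_dedup = inner_join_lhs_side(
    lhs_rows, rhs_rows, lambda l: l["a"], lambda r: r["x"])
```

Here `reference` and `via_unique` contain the same two rows, while `without_dedup` repeats each matching LHS row once per duplicate on the RHS, which is why the de-duplication step is needed before the join can be treated as a plain inner join.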

Up to now, while sort-based de-duplication handled such cases well,
hash-based de-duplication didn't: we'd still hash on all of the
originally-identified columns. This is probably not a very big
deal performance-wise, but in the wake of commit a3179ab69 it can
cause planner errors. That happens when join elimination causes
recalculation of variables' attr_needed bitmapsets, and we decide
that a variable mentioned in a semijoin clause doesn't need to be
propagated up to the join level anymore.

There are a number of ways we could slice the blame for this, but the
only fix that doesn't result in pessimizing plans for loosely-related
cases is to be more careful about not hashing columns we don't
actually need to de-duplicate. We can install that consideration
into create_unique_paths in master, or the predecessor code in
create_unique_path in v18, without much refactoring.
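
The idea of not hashing columns that need no de-duplication can be sketched as follows. This is a simplified model under stated assumptions, not the logic in pathnode.c: `columns_to_dedup`, `constant_cols`, and `equal_to` are hypothetical names, and the planner's actual knowledge of constants and redundant columns is represented here by a plain set and a dictionary.

```python
# Illustrative sketch only; not the planner's implementation.
# constant_cols and equal_to stand in for facts the planner has derived.

def columns_to_dedup(columns, constant_cols, equal_to):
    """Return the columns that still need explicit de-duplication.

    A column is skipped if it is known to be constant, or if it is known
    to be equal to a column that was already kept (equal_to maps a column
    to the representative of its group of equal columns).
    """
    needed = []
    kept_groups = set()
    for col in columns:
        if col in constant_cols:
            continue                      # constant: cannot distinguish rows
        group = equal_to.get(col, col)
        if group in kept_groups:
            continue                      # redundant with a previous column
        kept_groups.add(group)
        needed.append(col)
    return needed


def hash_dedup(rows, cols):
    """Keep the first row seen for each distinct combination of cols."""
    seen = set()
    out = []
    for row in rows:
        key = tuple(row[c] for c in cols)
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


# Sample RHS rows in which y is always 7 and z always equals x.
rhs_rows = [
    {"x": 1, "y": 7, "z": 1},
    {"x": 1, "y": 7, "z": 1},
    {"x": 2, "y": 7, "z": 2},
]
all_cols = ["x", "y", "z"]
needed_cols = columns_to_dedup(all_cols, {"y"}, {"x": "x", "z": "x"})

full_result = hash_dedup(rhs_rows, all_cols)
pruned_result = hash_dedup(rhs_rows, needed_cols)
```

In this example `needed_cols` is just `["x"]`, and hashing on it alone yields the same rows as hashing on all three columns. The pruned key also never refers to `y` or `z`, so in this toy model those columns would not have to be available at the de-duplication step at all.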

(As follow-up work, it might be a good idea to look at more-invasive
refactoring, in hopes of preventing other bugs in this area. But
with the v18 release so close, there's no time for that now, nor would
we be likely to want to put such refactoring into v18 anyway.)

Reported-by: Sergey Soloviev <sergey(dot)soloviev(at)tantorlabs(dot)ru>
Diagnosed-by: Richard Guo <guofenglinux(at)gmail(dot)com>
Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Reviewed-by: Richard Guo <guofenglinux(at)gmail(dot)com>
Discussion: https://postgr.es/m/1fd1a421-4609-4d46-a1af-ab74d5de504a@tantorlabs.ru
Backpatch-through: 18

Branch
------
REL_18_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/3aee6283709f6a7d826b3fb5773cc4496788a5df

Modified Files
--------------
src/backend/optimizer/util/pathnode.c | 155 ++++++++++++++++++++++++++++---
src/test/regress/expected/aggregates.out | 20 ----
src/test/regress/expected/join.out | 80 ++++++++++++++++
src/test/regress/sql/aggregates.sql | 9 --
src/test/regress/sql/join.sql | 33 +++++++
5 files changed, 254 insertions(+), 43 deletions(-)
