| From: | Richard Guo <rguo(at)postgresql(dot)org> |
|---|---|
| To: | pgsql-committers(at)lists(dot)postgresql(dot)org |
| Subject: | pgsql: Fix HAVING-to-WHERE pushdown with nondeterministic collations |
| Date: | 2026-05-01 02:17:38 |
| Message-ID: | E1wIdRh-003XDL-1X@gemulon.postgresql.org |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-committers |
Fix HAVING-to-WHERE pushdown with nondeterministic collations
When GROUP BY uses a nondeterministic collation, the planner's
optimization of moving HAVING clauses to WHERE can produce incorrect
query results. The HAVING clause may apply a stricter collation that
distinguishes values the GROUP BY considers equal. Pushing such a
clause to WHERE causes it to filter individual rows before grouping,
potentially eliminating group members and changing aggregate results.
Fix this by detecting collation conflicts before flatten_group_exprs,
while the HAVING clause still contains GROUP Vars (Vars referencing
RTE_GROUP). At that point, each GROUP Var directly carries the GROUP
BY collation as its varcollid, making it straightforward to compare
against the operator's inputcollid. A mismatch where the GROUP BY
collation is nondeterministic means the clause is unsafe to push down.
RowCompareExpr is treated specially, since it carries per-column
inputcollids[] rather than a single inputcollid.
The conflicting clause indices are recorded in a Bitmapset and
consulted during the existing HAVING-to-WHERE loop, so that only
affected clauses are kept in HAVING; other safe clauses in the same
query are still pushed.
Back-patch to v18 only. The fix relies on the RTE_GROUP mechanism
introduced in v18 (commit 247dea89f), which is what lets us identify
grouping expressions and their resolved collations via GROUP Vars on
pre-flatten havingQual. Pre-v18 branches lack that machinery, so a
back-patch there would need a different approach. Given the absence
of field reports of this bug on back branches, the risk of carrying a
different fix on stable branches is not justified.
Author: Richard Guo <guofenglinux(at)gmail(dot)com>
Reviewed-by: wenhui qiu <qiuwenhuifx(at)gmail(dot)com>
Discussion: https://postgr.es/m/CAMbWs48Dn2wW6XM94GZsoyMiH42=KgMo+WcobPKuWvGYnWaPOQ@mail.gmail.com
Backpatch-through: 18
Branch
------
REL_18_STABLE
Details
-------
https://git.postgresql.org/pg/commitdiff/e8fd5e579223f669245a8f7961c71b94afec2307
Modified Files
--------------
src/backend/optimizer/plan/planner.c | 188 ++++++++++++++++++++++
src/test/regress/expected/collate.icu.utf8.out | 213 +++++++++++++++++++++++++
src/test/regress/sql/collate.icu.utf8.sql | 66 ++++++++
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 468 insertions(+)
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Fujii Masao | 2026-05-01 03:14:36 | pgsql: Avoid blocking indefinitely while finishing walsender shutdown |
| Previous Message | Amit Langote | 2026-05-01 01:12:20 | pgsql: Use "concurrent delete" in serialization error for TM_Deleted ca |