| From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
|---|---|
| To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Subject: | Fix GROUP BY ALL handling of ORDER BY operator semantics |
| Date: | 2026-06-29 07:20:24 |
| Message-ID: | 5243308F-8E5C-45AA-828C-FAD96C4F34DA@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
While testing “[ef38a4d97] Add GROUP BY ALL”, I traced the code and found a suspicious difference. In theory, GROUP BY ALL should behave the same as spelling out all inferred grouping expressions, but I found that they go through different code paths.
After entering transformGroupClause(), if this is GROUP BY ALL, it immediately enters a separate branch, loops over the target list, calls addTargetToGroupList() for every TLE, and then returns.
For non-ALL, it calls transformGroupClauseExpr() for every group clause. Inside transformGroupClauseExpr(), there is logic that the ALL path misses:
```
/*
* If the GROUP BY tlist entry also appears in ORDER BY, copy operator
* info from the (first) matching ORDER BY item. This means that if
* you write something like "GROUP BY foo ORDER BY foo USING <<<", the
* GROUP BY operation silently takes on the equality semantics implied
* by the ORDER BY. There are two reasons to do this: it improves the
* odds that we can implement both GROUP BY and ORDER BY with a single
* sort step, and it allows the user to choose the equality semantics
* used by GROUP BY, should she be working with a datatype that has
* more than one equality operator.
*
* If we're in a grouping set, though, we force our requested ordering
* to be NULLS LAST, because if we have any hope of using a sorted agg
* for the job, we're going to be tacking on generated NULL values
* after the corresponding groups. If the user demands nulls first,
* another sort step is going to be inevitable, but that's the
* planner's problem.
*/
foreach(sl, sortClause)
{
SortGroupClause *sc = (SortGroupClause *) lfirst(sl);
if (sc->tleSortGroupRef == tle->ressortgroupref)
{
SortGroupClause *grpc = copyObject(sc);
if (!toplevel)
grpc->nulls_first = false;
*flatresult = lappend(*flatresult, grpc);
found = true;
break;
}
}
```
Based on this finding, I built a test using record_image_ops, where row(1.0) and row(1.00) compare as distinct under record-image equality:
```
evantest=# CREATE TYPE t_rec AS (x numeric);
CREATE TYPE
evantest=#
evantest=# SELECT row(1.0)::t_rec OPERATOR(pg_catalog.*=) row(1.00)::t_rec;
?column?
----------
f
(1 row)
```
Here is the repro:
```
evantest=# create table t (a t_rec);
CREATE TABLE
evantest=# set enable_hashagg = 0;
SET
evantest=# insert into t values(row(1.0)::t_rec), (row(1.00)::t_rec);
INSERT 0 2
evantest=# select a, count(a) from t group by a order by a using operator(pg_catalog.*<);
a | count
--------+-------
(1.00) | 1
(1.0) | 1
(2 rows)
evantest=# select a, count(a) from t group by all order by a using operator(pg_catalog.*<);
a | count
-------+-------
(1.0) | 2
(1 row)
```
As we can see, "GROUP BY a" distinguishes the two rows because it uses the equality semantics implied by ORDER BY ... USING, but "GROUP BY ALL" groups the two rows together because it uses the default grouping semantics instead.
The fix mostly refactors the existing logic so the GROUP BY ALL path also handles the ORDER BY sort clause. See the attached patch for details.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Paul A Jungwirth | 2026-06-29 07:38:34 | Re: [PATCH] Fix null pointer dereference in PG19 |
| Previous Message | Bertrand Drouvot | 2026-06-29 07:06:36 | Re: [PATCH] Change wait_time column of pg_stat_lock to double precision |