Quick Links

Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Richard Guo <guofenglinux(at)gmail(dot)com>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, 邱宇航 <iamqyh(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master
Date:	2025-10-17 18:41:06
Message-ID:	1722969.1760726466@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Richard Guo <guofenglinux(at)gmail(dot)com> writes:
> Having heard nothing but crickets and not wanting to leave this until
> the 11th hour before November, I'll plan to push the v1 patch next
> week, unless there are any objections.

OK, I finally made some time to think this through, and here's
my thoughts:

The definition of HAVING is that it excludes grouped rows that don't
satisfy the HAVING condition. The idea behind the push-down-HAVING
optimization is that if we exclude all source rows that don't satisfy
the condition, then no groups that don't satisfy it will be formed, so
we don't have to re-check the HAVING condition after forming groups.

Grouping sets break this concept, which is why the pre-67a54b9e
planner code just punted on the optimization if there were grouping
sets. There are two reasons why they cause trouble:

1. If a variable doesn't appear in every grouping set, then it will
sometimes go to NULL during grouping, meaning that the HAVING clause
needs to test a different value than what is available at the scan
level. We did not have a good way to deal with that pre-v18, but now
we can check to see whether the clause contains any variables marked
as nullable by group_rtindex; if not, it should be safe to push down.

2. If there is an empty grouping set, it will give rise to a grouped
row whether there is any input or not. In this case we *must* keep
the HAVING condition at the top level, else it fails to filter that
row. (We might still choose to copy it to WHERE, however.) Note
that whether the HAVING condition is degenerate (variable-free) or
not does not enter into this directly; although if it has any Vars
they are certainly all nullable by group_rtindex, meaning that
checking for #1 accidentally keeps us out of trouble unless the
condition is degenerate.

67a54b9e accounted for #1 but not #2, which is why we have a bug.

The proposed patch tries to close the hole by checking whether
the condition is degenerate, but that feels subtly wrong to me:
what we ought to check is whether there is any empty grouping set.
As proposed, I think we miss optimization opportunities for
degenerate HAVING because we will not try the trick of copying
it to WHERE.

Also, I suspect that this change in 67a54b9e was wrong in detail:

newHaving = lappend(newHaving, havingclause);
}
- else if (parse->groupClause && !parse->groupingSets)
+ else if (parse->groupClause)
{
Node *whereclause;

because it allows us to apply "move to WHERE" rather than "copy to
WHERE" in cases with grouping sets, and that can't always be right.
I think that if that change hadn't been made, we would not have
had a bug. So the fact that the proposed fix doesn't touch this
bit does not seem right.

regards, tom lane

In response to

Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master at 2025-10-16 09:13:35 from Richard Guo

Responses

Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master at 2025-10-17 21:14:02 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2025-10-17 18:45:12	Re: abi-compliance-check failure due to recent changes to pg_{clear,restore}_{attribute,relation}_stats()
Previous Message	Masahiko Sawada	2025-10-17 18:31:19	Re: Unused stricture field in xlogreader:DecodedBkpBlock