Re: Wrong results with grouping sets

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Wrong results with grouping sets
Date: 2024-01-06 20:59:36
Message-ID: 1353403.1704574776@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Richard Guo <guofenglinux(at)gmail(dot)com> writes:
> For a variable-free expression, if it contains volatile functions, SRFs,
> aggregates, or window functions, it would not be treated as a member of
> EC that is redundant (see get_eclass_for_sort_expr()). That means it
> would not be removed from the pathkeys list, so we do not need to set
> the nullingrels for it. Otherwise we can just wrap the expression in a
> PlaceHolderVar. Attached is an updated patch to do that.

I don't think this is going in quite the right direction. We have
many serious problems with grouping sets (latest one today at [1]),
and I don't believe that hacking around EquivalenceClasses is going
to fix them all.

I think that what we really need to do is invent a new kind of RTE
representing the output of the grouping step, with columns that
are the Vars or expressions being grouped on. Then we would make
the parser actually replace subexpressions in the tlist with Vars
referencing this new RTE (that is, change check_ungrouped_columns
into something that modifies the expression tree into something that
contains no Vars that aren't grouping-RTE Vars). In this way the
output of the parser directly expresses the semantic requirement that
certain subexpressions be gotten from the grouping output rather than
computed some other way.

The trick is to do this without losing optimization capability.
We could have the planner replace these Vars with the underlying
Vars in cases where it's safe to do so (perhaps after adding a
nullingrel bit that references the grouping RTE). If a grouping
column is an expression, we might be able to replace the reference
Vars with PHVs as you've done here ... but I think we need the
parser infrastructure fixed first.

regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CAEzk6fcgXWabEG%2BRFDaG6tDmFX6g1h7LPGUdrX85Pb0XB3B76g%40mail.gmail.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-01-06 21:08:57 Re: Fix bogus Asserts in calc_non_nestloop_required_outer
Previous Message Tom Lane 2024-01-06 20:10:59 Re: Password leakage avoidance