Re: Wrong results with grouping sets

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Wrong results with grouping sets
Date: 2026-02-24 08:10:23
Message-ID: CAMbWs4_rs8r4D+j2rPT-MX4y6DOBcP3c8tGPA9MJBYkZPPtD2g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 23, 2026 at 11:48 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I happened to notice ffe12d1d2 after David mentioned
> simplify_EXISTS_query in another thread, and nearly fell off my chair
> when I read this bit:
>
> query->rtable = list_delete_cell(query->rtable, lc);
>
> How can that possibly be safe? It will change the rangetable index of
> every following RTE. It might appear to work as long as the RTE_GROUP
> RTE is always last. But I don't think you can rely on that, or should
> rely on it even if it does happen to still be always true even after
> query rewrite and other early-planning manipulations.

Ugh. I'm not sure what I was thinking there. Deleting an RTE from
the middle of the rtable is definitely taboo, as it shifts the indexes
and can silently corrupt any Var nodes in the query tree that
reference those later RTEs. Also, relying on the RTE_GROUP happening
to be the last entry in the list is extremely fragile and just trouble
waiting to happen.

> A safer way might be to convert the RTE into an unreferenced
> RTE_RESULT, or some other innocuous RTE type.

It seems our convention is to convert the RTE into an RTE_RESULT RTE
in-place (eg, as in pull_up_constant_function). A side effect of this
approach is that it might force the parent query to run
remove_useless_result_rtes when it otherwise wouldn't need to. It
seems to me that this is acceptable, because 1) this edge case only
happens when an EXISTS subquery contains GROUP BY clauses and in the
meanwhile is successfully simplified; 2) the overhead of walking the
jointree in remove_useless_result_rtes is minimal; 3) in some cases,
running it actually helps elide single-child FromExprs, such as in
this query:

select * from t1 where exists (select 1 from t2 where a = t1.a group by a);

... although this only has micro-efficiency value.

Attached is the patch that converts the RTE into an RTE_RESULT.

- Richard

Attachment Content-Type Size
v1-0001-Fix-unsafe-RTE_GROUP-removal-in-simplify_EXISTS_q.patch application/octet-stream 3.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ilia Evdokimov 2026-02-24 08:15:54 Re: Reduce planning time for large NOT IN lists containing NULL
Previous Message Michael Paquier 2026-02-24 08:01:26 Re: change default default_toast_compression to lz4?