Re: POC: GROUP BY optimization

From: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Richard Guo <guofenglinux(at)gmail(dot)com>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru>
Subject: Re: POC: GROUP BY optimization
Date: 2024-04-17 03:11:59
Message-ID: a663f0f6-cbf6-49aa-af2e-234dc6768a07@postgrespro.ru
Lists: pgsql-hackers

On 4/12/24 06:44, Tom Lane wrote:
> * I'm pretty unconvinced by group_keys_reorder_by_pathkeys (which
> I notice has already had one band-aid added to it since commit).
> In particular, it seems to believe that the pathkeys and clauses
> lists match one-for-one, but I seriously doubt that that invariant
> remains guaranteed after the cleanup steps
>
>     /* append the remaining group pathkeys (will be treated as not sorted) */
>     *group_pathkeys = list_concat_unique_ptr(new_group_pathkeys,
>                                              *group_pathkeys);
>     *group_clauses = list_concat_unique_ptr(new_group_clauses,
>                                             *group_clauses);
>
> For that to be reliable, the SortGroupClauses added to
> new_group_clauses in the main loop have to be exactly those
> that are associated with the same pathkeys in the old lists.
> I doubt that that's necessarily true in the presence of redundant
> grouping clauses. (Maybe we can't get here with any redundant
> grouping clauses, but still, we don't really guarantee uniqueness of
> SortGroupClauses, much less that they are never copied which is what
> you need if you want to believe that pointer equality is sufficient
> for de-duping here. PathKeys are explicitly made to be safe to compare
> pointer-wise, but I know of no such guarantee for SortGroupClauses.)
I spent a lot of time trying to construct a case with duplicate
SortGroupClauses. So far I have not managed to produce one. But because
we really don't guarantee uniqueness, I changed the code so that it
survives this case. I also added assertions to make sure we don't have
logic errors here; see the attachment.
As for the band-aid mentioned above: as far as I can see, commit 4169850
introduces the same trick in planner.c, so it looks like a consequence
of the design of the current code.
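
To make the pointer-equality concern concrete, here is a minimal
standalone sketch. It does not use PostgreSQL's List API: DemoClause is
a hypothetical stand-in for SortGroupClause (only two fields are kept),
and concat_unique() and demo_copied_clause() are illustrative helpers,
not functions from the patch. It assumes a clause could reach the
reordered list as a copy, which is exactly the situation that has not
been reproduced in practice so far.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for SortGroupClause; not the real struct. */
typedef struct DemoClause
{
    unsigned    tleSortGroupRef;
    unsigned    sortop;
} DemoClause;

/* Field-by-field comparison, in the spirit of equal(). */
static bool
demo_clause_equal(const DemoClause *a, const DemoClause *b)
{
    return a->tleSortGroupRef == b->tleSortGroupRef &&
           a->sortop == b->sortop;
}

/*
 * Append to dst every element of src that is not already in dst.
 * With by_value = false, membership is tested by pointer identity
 * (the list_concat_unique_ptr style); with by_value = true, by content.
 * The caller must provide room for ndst + nsrc entries in dst.
 * Returns the new number of entries in dst.
 */
static size_t
concat_unique(const DemoClause **dst, size_t ndst,
              const DemoClause **src, size_t nsrc, bool by_value)
{
    for (size_t i = 0; i < nsrc; i++)
    {
        bool        found = false;

        for (size_t j = 0; j < ndst && !found; j++)
            found = by_value ? demo_clause_equal(dst[j], src[i])
                             : (dst[j] == src[i]);
        if (!found)
            dst[ndst++] = src[i];
    }
    return ndst;
}

/*
 * Assumed scenario: the reordered list holds a copy of clause "a",
 * while the original list holds "a" itself plus "b".
 */
static void
demo_copied_clause(size_t *n_ptr, size_t *n_value)
{
    DemoClause  a = {1, 97};
    DemoClause  b = {2, 97};
    DemoClause  a_copy = a;     /* same contents, different address */

    const DemoClause *by_ptr[3] = {&a_copy};
    const DemoClause *by_val[3] = {&a_copy};
    const DemoClause *original[2] = {&a, &b};

    /* Pointer comparison does not recognize the copy, so "a" is kept twice. */
    *n_ptr = concat_unique(by_ptr, 1, original, 2, false);
    /* Content comparison recognizes the copy and appends only "b". */
    *n_value = concat_unique(by_val, 1, original, 2, true);

    /* Parallel lists stay in sync only if both are de-duplicated alike. */
    assert(*n_value <= *n_ptr);
}
```

In this scenario the pointer-based variant ends up with three clauses
while a pathkey list de-duplicated by pointer would have two, so the two
lists would no longer match one-for-one. Comparing by content, or
asserting that the list lengths agree afterwards, catches the mismatch.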

--
regards,
Andrei Lepikhov
Postgres Professional

Attachment Content-Type Size
get_useful_group_keys_orderings.patch text/x-patch 3.8 KB
