Re: POC: GROUP BY optimization

From: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Richard Guo <guofenglinux(at)gmail(dot)com>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru>
Subject: Re: POC: GROUP BY optimization
Date: 2024-04-18 08:54:45
Message-ID: db0fc3a4-966c-4cec-a136-94024d39212d@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 4/12/24 06:44, Tom Lane wrote:
> * Speaking of pathnodes.h, PathKeyInfo is a pretty uninformative node
> type name, and the comments provided for it are not going to educate
> anybody. What is the "association" between the pathkeys and clauses?
> I guess the clauses are supposed to be SortGroupClauses, but not all
> pathkeys match up to SortGroupClauses, so what then? This is very
> underspecified, and fuzzy thinking about data structures tends to lead
> to bugs.
I'm not the best in English documentation and naming style. So, feel
free to check my version.
>
> So I'm quite afraid that there are still bugs lurking here.
> What's more, given that the plans this patch makes apparently
> seldom win when you don't put a thumb on the scales, it might
> take a very long time to isolate those bugs. If the patch
> produces a faulty plan (e.g. not sorted properly) we'd never
> notice if that plan isn't chosen, and even if it is chosen
> it probably wouldn't produce anything as obviously wrong as
> a crash.
I added checkings on the proper order of pathkeys and clauses.
If you really care about that, we should spend additional time and
rewrite the code to generate an order of clauses somewhere in the plan
creation phase. For example, during the create_group_plan call, we could
use the processed_groupClause list, cycle through subpath->pathkeys, set
the order, and detect whether the pathkeys list corresponds to the
group-by or is enough to build a grouping plan.
Anyway, make this part of code more resistant to blunders is another story.

--
regards,
Andrei Lepikhov
Postgres Professional

Attachment Content-Type Size
final_improvements.patch text/x-patch 7.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2024-04-18 08:59:33 Re: plenty code is confused about function level static
Previous Message Christoph Berg 2024-04-18 08:54:18 Re: pgsql: meson: Add initial version of meson based build system