Re: POC: GROUP BY optimization

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: POC: GROUP BY optimization
Date: 2024-01-09 09:45:27
Message-ID: CALDaNm2CP2XQGTdNYKHnETeQZUV_TUBM-S297-298VqiwrFrLg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 9 Jan 2024 at 14:31, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
>
> Here is a new version of GROUP-BY optimization without sort model.
>
> On 21/12/2023 17:53, Alexander Korotkov wrote:
> > I'd like to make some notes.
> >
> > 1) As already mentioned, there is clearly a repetitive pattern for the
> > code following after get_useful_group_keys_orderings() calls. I think
> > it would be good to extract it into a separate function. Please, do
> > this as a separate patch coming before the group-by patch. That would
> > simplify the review.
> Done. See patch 0001-*. Unfortunately, extraction of whole cycle isn't
> practical, because it blows out the interface of the routine.
>
> > 2) I wonder what planning overhead this patch could introduce? Could
> > you try to measure the worst case? What if we have a table with a lot
> > of indexes and a long list of group-by clauses partially patching
> > every index. This should give us an understanding on whether we need
> > a separate GUC to control this feature.
> In current implementation I don't anticipate any significant overhead.
> GUC is needed here to allow users adhere their own ordering and to
> disable feature in the case of problems.
>
> > 4) I think we can do some optimizations when enable_incremental_sort
> > == off. Then in get_useful_group_keys_orderings() we should only deal
> > with input_path fully matching the group-by clause, and try only full
> > match of group-by output to the required order.
> Hm, is it really make sense in current implementation?

CFBot shows the following errors at [1] with:
[08:33:28.813] ../src/backend/utils/adt/selfuncs.c: In function
‘estimate_num_groups’:
[08:33:28.813] ../src/backend/utils/adt/selfuncs.c:3389:9: warning:
implicit declaration of function ‘estimate_num_groups_incremental’
[-Wimplicit-function-declaration]
[08:33:28.813] 3389 | return estimate_num_groups_incremental(root, groupExprs,
[08:33:28.813] | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[08:33:28.813] ../src/backend/utils/adt/selfuncs.c: At top level:
[08:33:28.813] ../src/backend/utils/adt/selfuncs.c:3400:1: warning: no
previous prototype for ‘estimate_num_groups_incremental’
[-Wmissing-prototypes]
[08:33:28.813] 3400 | estimate_num_groups_incremental(PlannerInfo
*root, List *groupExprs,
[08:33:28.813] | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[08:33:28.813] ../src/backend/utils/adt/selfuncs.c:3400:1: error:
conflicting types for ‘estimate_num_groups_incremental’
[08:33:28.813] ../src/backend/utils/adt/selfuncs.c:3389:9: note:
previous implicit declaration of ‘estimate_num_groups_incremental’ was
here
[08:33:28.813] 3389 | return estimate_num_groups_incremental(root, groupExprs,

[1] - https://cirrus-ci.com/task/5074942069309440

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-01-09 10:08:53 Re: Random pg_upgrade test failure on drongo
Previous Message Christoph Berg 2024-01-09 09:43:53 Re: psql JSON output format