Re: Group by reordering optimization

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: Group by reordering optimization
Date: 2020-09-01 21:08:54
Message-ID: 20200901210743.lutgvnfzntvhuban@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 01, 2020 at 01:15:31PM +0200, Dmitry Dolgov wrote:
>Hi,
>
>Better late than never, to follow up on the original thread [1] I would like to
>continue the discussion with the another version of the patch for group by
>reordering optimization. To remind, it's about reordering of group by clauses
>to do sorting more efficiently. The patch is rebased and modified to address
>(at least partially) the suggestions about making it consider new additional
>paths instead of changing original ones. It is still pretty much
>proof-of-concept version though with many blind spots, but I wanted to start
>kicking it and post at least something, otherwise it will never happen. An
>incremental approach so to say.
>
>In many ways it still contains the original code from Teodor. Changes and notes:
>
>* Instead of changing the order directly, now patch creates another patch with
> modifier order of clauses. It does so for the normal sort as well as for
> incremental sort. The whole thing is done in two steps: first it finds a
> potentially better ordering taking into account number of groups, widths and
> comparison costs; afterwards this information is used to produce a cost
> estimation. This is implemented via a separate create_reordered_sort_path to
> not introduce too many changes, I couldn't find any better place.
>

I haven't tested the patch with any queries, but I agree this seems like
the right approach in general.

I'm a bit worried about how complex the code in planner.c is getting -
the incremental sort patch already made it a bit too complex, and this
is just another step in that direction. I suppose we should refactor
add_paths_to_grouping_rel() by breaking it into smaller / more readable
pieces ...

>* Function get_func_cost was removed at some point, but unfortunately this
> patch was implemented before that, so it's still present there.
>
>* For simplicity I've removed support in create_partial_grouping_paths, since
> they were not covered by the existing tests anyway.
>

Hmmm, OK. I think that's something we'll need to address for the final
patch, but I agree we can add it after improving the costing etc.

>* The costing part is pretty rudimentary and looks only at the first group.
> It's mostly hand crafted to pass the existing tests.
>

OK, understood.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2020-09-01 21:15:20 Re: Maximum password length
Previous Message Alvaro Herrera 2020-09-01 21:08:25 Re: v13: show extended stats target in \d